NOVEL HUMAN GENES AND GENE EXPRESSION PRODUCTS

FIELD OF THE INVENTION 

The present invention relates to novel polynucleotides of human origin
and the encoded gene products.

BACKGROUND OF THE INVENTION

Identification of novel polynucleotides, particularly those that encode an 
expressed gene product, is important in the advancement of drug discovery, diagnostic 
technologies, and the understanding of the progression and nature of complex diseases 
such as cancer. Identification of genes expressed in different cell types isolated from 
sources that differ in disease state or stage, developmental stage, exposure to various
environmental factors, the tissue of origin, the species from which the tissue was 
isolated, and the like is key to identifying the genetic factors that are responsible for the 
phenotypes associated with these various differences. 

This invention provides novel human polynucleotides, the polypeptides 
encoded by these polynucleotides, and the genes and proteins corresponding to these
novel polynucleotides. 

SUMMARY OF THE INVENTION 

This invention relates to novel human polynucleotides and variants
thereof, their encoded polypeptides and variants thereof, to genes corresponding to these
polynucleotides and to proteins expressed by the genes. The invention also relates to
diagnostics and therapeutics comprising such novel human polynucleotides, their
corresponding genes or gene products, including probes, antisense nucleotides, and
antibodies. The polynucleotides of the invention correspond to a polynucleotide
comprising the sequence information of at least one of SEQ ID NOs:1-3351.

Various aspects and embodiments of the invention will be readily
apparent to the ordinarily skilled artisan upon reading the description provided herein.

DETAILED DESCRIPTION OF THE INVENTION 

The invention relates to polynucleotides comprising the disclosed
nucleotide sequences, to full length cDNA, mRNA, genomic sequences, and genes
corresponding to these sequences and degenerate variants thereof, and to polypeptides
encoded by the polynucleotides of the invention and polypeptide variants.

Polypeptide variants differ from wild type protein in having one or more
amino acid substitutions that either enhance, add, or diminish a biological activity of the
wild type protein.

Six of the polynucleotides disclosed herein encode new members of the MKK
kinase family; the coding region is found within the nucleotide region in parentheses: SEQ
ID NO:29 (nucleotides 295-421); SEQ ID NO:31 (298-397); SEQ ID NO:196 (37-322);
SEQ ID NO:3175 (nucleotides 14-164); SEQ ID NO:3190 (229-390); and SEQ ID
NO:3281 (15-182). Twenty-four of the polynucleotides encode new members of the family
of transcription factor proteins having a basic region plus leucine zipper: SEQ ID NO:410
(42-191); SEQ ID NO:552 (116-288); SEQ ID NO:768 (116-288); SEQ ID NO:822 (108-262);
SEQ ID NO:836 (158-353); SEQ ID NO:1288 (73-234); SEQ ID NO:1365 (69-257);
SEQ ID NO:1540 (289-471); SEQ ID NO:1549 (200-391); SEQ ID NO:1556 (163-354);
SEQ ID NO:1557 (207-398); SEQ ID NO:1563 (107-298); SEQ ID NO:1622 (180-365);
SEQ ID NO:1630 (100-291); SEQ ID NO:1704 (184-372); SEQ ID NO:1808 (36-161);
SEQ ID NO:1454 (49-209); SEQ ID NO:2363 (48-211); SEQ ID NO:2424 (43-194);
SEQ ID NO:3147 (190-369); SEQ ID NO:3152 (129-320); SEQ ID NO:3158 (167-334);
and SEQ ID NO:3208 (34-256).

SEQ ID NOs:186 (175-395); 2591 (60-165); 3307 (43-321); and 3339
(94-342) encode polypeptides having an SH2 domain, and SEQ ID NOs:234 (23-121),
1832 (18-173), and 1835 (57-206) encode polypeptides having an SH3 domain. Nine
polynucleotides encode new members of the family of proteins having Ank repeat regions:
SEQ ID NO:187 (358-432); SEQ ID NO:1268 (238-315); SEQ ID NO:1804 (301-378);
SEQ ID NO:1819 (278-355); SEQ ID NO:1839 (224-307); SEQ ID NO:1830 (184-267);
SEQ ID NO:2562 (18-101); SEQ ID NO:3015 (131-214); and SEQ ID NO:3267 (97-180).

The following eleven polynucleotides encode polypeptides having a C2H2
type zinc finger: SEQ ID NOs:308 (110-172); 807 (339-392); 1324 (294-356); 1503 (154-216);
1527 (156-212); 1674 (196-258); 1779 (64-126); 1801 (295-351); 3081 (190-252);
3193 (293-355); and 3306 (161-223). Eight polynucleotides encode polypeptides of the
family of ATPases: SEQ ID NOs:431 (71-428); 639 (157-561); 2135 (2-401); 2684 (9-461);
2859 (100-320); 3178 (45-386); 3197 (281-343) and 3266 (8-139). Polypeptides
having a fibronectin type III domain are encoded by SEQ ID NO:746 (209-427) and 1192
(186-416). Polypeptides having an EF-hand domain are encoded by SEQ ID NO:820 (341-406);
1755 (281-367) and 3285 (16-102). Six polypeptides of the protein kinase family are
encoded by SEQ ID NOs:1157 (41-444); 1478 (54-437); 1496 (241-520); 2286 (12-182);
2969 (5-387); and 3190 (118-390).

LIM domain-containing polypeptides are encoded by SEQ ID NO:1269
(79-240); 1309 (248-404); 1360 (222-377); and 1386 (243-398). Two polypeptides of the
family having a C2 domain (protein kinase C-like) are encoded by SEQ ID NO:1325 (1-234)
and 2282 (183-353). Polypeptides having a WD domain, G-beta repeat motif are
encoded by SEQ ID NOs:1336 (66-164); 1380 (42-140); 1711 (263-361); 1762 (236-334);
1909 (160-258); 2218 (127-225); 3047 (191-292); 3108 (275-367) and 3292 (208-300).

SEQ ID NO:1410 (222-350) encodes a member of the trypsin family. SEQ
ID NOs:1417 (8-354); 2281 (20-387) and 2310 (20-371) encode members of the protein
tyrosine phosphatase family. SEQ ID NOs:1464 (4-180) and 1514 (2-252) encode
members of the family having an RNA recognition motif (also known as RRM, RBD, or
RNP domain). SEQ ID NOs:1496 (241-520) and 3297 (7-153) encode helicases having a
conserved C-terminal domain. SEQ ID NO:1538 (9-635) encodes a member of the wnt
family of developmental signaling proteins.

Three polynucleotides encode polypeptides having a homeobox domain:
SEQ ID NOs:1676 (9-86); 1820 (123-299); and 1821 (127-303). A novel thioredoxin is
encoded by SEQ ID NO:1677 (316-369). Two novel members of the ras family are
encoded by SEQ ID NO:1688 (109-410) and 3258 (138-394). A novel polypeptide having a
phosphatidylinositol-specific phospholipase C Y-domain is encoded by SEQ ID NO:1707
(92-439). A novel serine carboxypeptidase is encoded by SEQ ID NO:1744 (238-433). A
novel polypeptide having N-terminal homology in the Ets domain is encoded by SEQ ID
NO:1811 (184-315). A novel polypeptide having a bromodomain is encoded by SEQ ID
NO:1814 (127-294). A novel polypeptide having a double-stranded RNA binding motif is
encoded by SEQ ID NO:1818 (9-146). A novel polypeptide having a G-protein alpha
subunit is encoded by SEQ ID NO:1846 (12-398).

SEQ ID NOs:1911 (35-151) and 1980 (60-197) encode polypeptides
having a C3HC4 type zinc finger domain (RING finger). SEQ ID NO:2065 (253-306)
encodes a polypeptide having a CCHC zinc finger domain. SEQ ID NO:2216 (90-179)
encodes a polypeptide having a WW/rsp5/WWP domain. SEQ ID NO:2428 (25-350)
encodes a polypeptide member of the dual specificity phosphatase family, having a
catalytic domain.

SEQ ID NOs:2577 (0-311); 3183 (14-215); and 3195 (0-215) encode
members of the 4 transmembrane segment integral membrane protein family. SEQ ID
NOs:2826 (116-400) and 2871 (198-392) encode polypeptides of the DEAD and DEAH
box helicase family. SEQ ID NO:2944 (18-281) encodes a polypeptide having a
calpain large subunit, domain III.

SEQ ID NO:3274 (11-187) encodes a eukaryotic transcription factor
with a fork head domain. SEQ ID NO:3345 (65-271) encodes a polypeptide having a
PDZ domain, and SEQ ID NO:3351 (124-270) encodes a polypeptide in the family of
phorbol esters/glycerol binding proteins.

Described below are polynucleotide compositions encompassed by the
invention, methods for obtaining cDNA or genomic DNA encoding a full-length gene
product, expression of these polynucleotides and genes, identification of structural motifs
of the polynucleotides and genes, identification of the function of a gene product encoded
by a gene corresponding to a polynucleotide of the invention, use of the provided
polynucleotides as probes and in mapping and in tissue profiling, use of the corresponding
polypeptides and other gene products to raise antibodies, and use of the polynucleotides
and their encoded gene products for therapeutic and diagnostic purposes.

Polynucleotide Compositions 

The scope of the invention with respect to polynucleotide compositions
includes, but is not necessarily limited to, polynucleotides having a sequence set forth in
any one of SEQ ID NOs:1-3351; polynucleotides obtained from the biological materials
described herein or other biological sources (particularly human sources) by
hybridization under stringent conditions (particularly conditions of high stringency);
genes corresponding to the provided polynucleotides; variants of the provided
polynucleotides and their corresponding genes, particularly those variants that retain a
biological activity of the encoded gene product (e.g., a biological activity ascribed to a
gene product corresponding to the provided polynucleotides as a result of the
assignment of the gene product to a protein family(ies) and/or identification of a
functional domain present in the gene product). Other nucleic acid compositions
contemplated by and within the scope of the present invention will be readily apparent
to one of ordinary skill in the art when provided with the disclosure here.
"Polynucleotide" and "nucleic acid" as used herein with reference to nucleic acids of
the composition are not intended to be limiting as to the length or structure of the nucleic
acid unless specifically indicated.

The invention features polynucleotides that are expressed in human
tissue, specifically human colon, breast, and/or lung tissue. Novel nucleic acid
compositions of the invention comprise a sequence set forth in any one of SEQ ID
NOs:1-3351 or an identifying sequence thereof. An "identifying sequence" is a
contiguous sequence of residues at least about 10 nt to about 20 nt in length, usually at
least about 50 nt to about 100 nt in length, that uniquely identifies a polynucleotide
sequence, e.g., exhibits less than 90%, usually less than about 80% to about 85%
sequence identity to any contiguous nucleotide sequence of more than about 20 nt.
Thus, the subject novel nucleic acid compositions include full length cDNAs or mRNAs
that encompass an identifying sequence of contiguous nucleotides from any one of SEQ
ID NOs:1-3351.

The polynucleotides of the invention also include polynucleotides having
sequence similarity or sequence identity. Nucleic acids having sequence similarity are
detected by hybridization under low stringency conditions, for example, at 50°C and
10XSSC (0.9 M saline/0.09 M sodium citrate) and remain bound when subjected to
washing at 55°C in 1XSSC. Sequence identity can be determined by hybridization
under stringent conditions, for example, at 50°C or higher and 0.1XSSC (9 mM
saline/0.9 mM sodium citrate). Hybridization methods and conditions are well known
in the art, see, e.g., U.S. Patent No. 5,707,829. Nucleic acids that are substantially
identical to the provided polynucleotide sequences, e.g., allelic variants, genetically
altered versions of the gene, etc., bind to the provided polynucleotide sequences (SEQ
ID NOs:1-3351) under stringent hybridization conditions. By using probes, particularly
labeled probes of DNA sequences, one can isolate homologous or related genes. The
source of homologous genes can be any species, e.g., primate species, particularly
human; rodents, such as rats and mice; canines, felines, bovines, ovines, equines, yeast,
nematodes, etc.

Preferably, hybridization is performed using at least 15 contiguous
nucleotides (nt) of at least one of SEQ ID NOs:1-3351. That is, when at least 15
contiguous nt of one of the disclosed SEQ ID NOs. is used as a probe, the probe will
preferentially hybridize with a nucleic acid comprising the complementary sequence,
allowing the identification and retrieval of the nucleic acids that uniquely hybridize to
the selected probe. Probes from more than one SEQ ID NO. can hybridize with the
same nucleic acid if the cDNA from which they were derived corresponds to one
mRNA. Probes of more than 15 nt can be used, e.g., probes of from about 18 nt to
about 100 nt, but 15 nt represents sufficient sequence for unique identification.
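
By way of illustration only, the relationship between a probe of at least 15 contiguous nt and the complementary sequence with which it hybridizes can be sketched in Python. The function names and example sequences are hypothetical and are not taken from SEQ ID NOs:1-3351. Perfect base pairing is assumed, which is a simplification of hybridization under stringent conditions.

```python
# Illustrative sketch only: models hybridization as exact base pairing.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def probe_binding_sites(probe: str, target: str, min_len: int = 15) -> list[int]:
    """Return 0-based start positions on `target` where `probe` pairs perfectly.

    The 15 nt minimum follows the text; exact matching is an assumption.
    """
    if len(probe) < min_len:
        raise ValueError(f"probe must be at least {min_len} nt long")
    site = reverse_complement(probe)  # the strand the probe would anneal to
    target = target.upper()
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)
    return positions


if __name__ == "__main__":
    probe = "ATGGCCATTGTAATG"  # hypothetical 15 nt probe
    target = "GG" + reverse_complement(probe) + "TT"
    print(probe_binding_sites(probe, target))
```

A probe that returns exactly one position in a given target would be a candidate for unique identification under this simplified model.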

The polynucleotides of the invention also include naturally occurring
variants of the nucleotide sequences (e.g., degenerate variants, allelic variants).
Variants of the polynucleotides of the invention are identified by hybridization of
putative variants with nucleotide sequences disclosed herein, preferably by
hybridization under stringent conditions. For example, by using appropriate wash
conditions, variants of the polynucleotides of the invention can be identified where the
allelic variant exhibits at most about 25-30% base pair (bp) mismatches relative to the
selected polynucleotide probe. In general, allelic variants contain 15-25% bp
mismatches, and can contain as few as 5-15%, or 2-5%, or 1-2% bp mismatches,
or even a single bp mismatch.
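
The percentage of bp mismatches referred to above can be computed, in a simplified sketch, for a probe and a candidate variant that have already been aligned position by position. The function name is hypothetical, and the equal-length, gap-free comparison is an assumption made for illustration.

```python
def percent_mismatch(probe: str, candidate: str) -> float:
    """Percent of positions that differ between two pre-aligned, equal-length sequences.

    Assumption: no gaps; the sequences are compared position by position.
    """
    if not probe or len(probe) != len(candidate):
        raise ValueError("sequences must be non-empty and of equal length")
    mismatches = sum(x != y for x, y in zip(probe.upper(), candidate.upper()))
    return 100.0 * mismatches / len(probe)


if __name__ == "__main__":
    # One difference in 20 positions, i.e. within the 2-5% range mentioned in the text.
    print(percent_mismatch("ACGTACGTACGTACGTACGT", "ACGTACGTACGTACGTACGA"))
```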

The invention also encompasses homologs corresponding to the
polynucleotides of SEQ ID NOs:1-3351, where the source of homologous genes can be
any mammalian species, e.g., primate species, particularly human; rodents, such as rats;
canines, felines, bovines, ovines, equines, yeast, nematodes, etc. Between mammalian
species, e.g., human and mouse, homologs generally have substantial sequence
similarity, e.g., at least 75% sequence identity, usually at least 90%, more usually at
least 95% between nucleotide sequences. Sequence similarity is calculated based on a
reference sequence, which may be a subset of a larger sequence, such as a conserved
motif, coding region, flanking region, etc. A reference sequence will usually be at least
about 18 contiguous nt long, more usually at least about 30 nt long, and may extend to
the complete sequence that is being compared. Algorithms for sequence analysis are
known in the art, such as BLAST, described in Altschul et al., J. Mol. Biol. (1990)
215:403-10.

In general, variants of the invention have a sequence identity greater than
at least about 65%, preferably at least about 75%, more preferably at least about 85%,
and can be greater than at least about 90%, 91%, 92%, 93%, 94%, 95%, or 96%, most
preferably 97%, 98% or 99%. For the purposes of this invention, a preferred method of
calculating percent identity is the Smith-Waterman algorithm, applied as follows.
Global DNA sequence identity must be greater than 65% as determined by the Smith-Waterman
homology search algorithm as implemented in MPSRCH program (Oxford
Molecular) using an affine gap search with the following search parameters: gap open
penalty, 12; and gap extension penalty, 1.
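
For illustration, a Smith-Waterman local alignment with affine gap penalties (gap open penalty 12, gap extension penalty 1) can be sketched as follows. This is not the MPSRCH implementation. The match and mismatch scores (+5 and -4), the gap cost convention, and the definition of percent identity as identical columns divided by alignment length are all assumptions made for this sketch, and the function names are hypothetical.

```python
def smith_waterman_affine(a, b, match=5, mismatch=-4, gap_open=12, gap_extend=1):
    """Best local alignment of a and b; returns (score, aligned_a, aligned_b).

    Assumed gap convention: a gap of k positions costs gap_open + (k - 1) * gap_extend.
    """
    neg = float("-inf")
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]    # best score of an alignment ending at (i, j)
    E = [[neg] * (m + 1) for _ in range(n + 1)]  # ... ending with b[j-1] opposite a gap
    F = [[neg] * (m + 1) for _ in range(n + 1)]  # ... ending with a[i-1] opposite a gap
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i][j] = max(H[i][j - 1] - gap_open, E[i][j - 1] - gap_extend)
            F[i][j] = max(H[i - 1][j] - gap_open, F[i - 1][j] - gap_extend)
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, E[i][j], F[i][j])
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the best cell, tracking which matrix the path is in.
    i, j = best_pos
    state, out_a, out_b = "H", [], []
    while i > 0 and j > 0:
        if state == "H":
            if H[i][j] == 0:
                break  # start of the local alignment
            if H[i][j] == E[i][j]:
                state = "E"
            elif H[i][j] == F[i][j]:
                state = "F"
            else:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
        elif state == "E":
            out_a.append("-")
            out_b.append(b[j - 1])
            if E[i][j] == H[i][j - 1] - gap_open:
                state = "H"  # the gap was opened at this column
            j -= 1
        else:
            out_a.append(a[i - 1])
            out_b.append("-")
            if F[i][j] == H[i - 1][j] - gap_open:
                state = "H"
            i -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identical columns as a percentage of alignment length (assumed definition)."""
    if not aligned_a:
        return 0.0
    same = sum(x == y for x, y in zip(aligned_a, aligned_b))
    return 100.0 * same / len(aligned_a)


if __name__ == "__main__":
    score, top, bottom = smith_waterman_affine("ACGTACGTAC", "ACGTTCGTAC")
    print(score, top, bottom, percent_identity(top, bottom))
```

Under these assumptions, a candidate variant would be compared against the 65% threshold stated above by passing the two aligned strings to `percent_identity`.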

The subject nucleic acids can be cDNAs or genomic DNAs, as well as
fragments thereof, particularly fragments that encode a biologically active gene product
and/or are useful in the methods disclosed herein (e.g., in diagnosis, as a unique
identifier of a differentially expressed gene of interest, etc.). The term "cDNA" as used
herein is intended to include all nucleic acids that share the arrangement of sequence
elements found in native mature mRNA species, where sequence elements are exons
and 3' and 5' non-coding regions. Normally mRNA species have contiguous exons,
with the intervening introns, when present, being removed by nuclear RNA splicing, to
create a continuous open reading frame encoding a polypeptide of the invention.

A genomic sequence of interest comprises the nucleic acid present
between the initiation codon and the stop codon, as defined in the listed sequences,
including all of the introns that are normally present in a native chromosome. It can
further include the 3' and 5' untranslated regions found in the mature mRNA. It can
further include specific transcriptional and translational regulatory sequences, such as
promoters, enhancers, etc., including about 1 kb, but possibly more, of flanking
genomic DNA at either the 5' or 3' end of the transcribed region. The genomic DNA
can be isolated as a fragment of 100 kbp or smaller, and substantially free of flanking
chromosomal sequence. The genomic DNA flanking the coding region, either 3' or
5', or internal regulatory sequences as sometimes found in introns, contains sequences
required for proper tissue, stage-specific, or disease-state specific expression.

The nucleic acid compositions of the subject invention can encode all or
a part of the subject polypeptides. Double or single stranded fragments can be obtained
from the DNA sequence by chemically synthesizing oligonucleotides in accordance
with conventional methods, by restriction enzyme digestion, by PCR amplification, etc.
Isolated polynucleotides and polynucleotide fragments of the invention comprise at
least about 10, about 15, about 20, about 35, about 50, about 100, about 150 to about
200, about 250 to about 300, or about 350 contiguous nt selected from the
polynucleotide sequences as shown in SEQ ID NOs:1-3351. The fragments also
include those of lengths intermediate to the specifically mentioned lengths, such as 35,
36, 37, 38, 39, etc.; 150, 151, 152, 153, 154, etc. For the most part, fragments will be of
at least 15 nt, usually at least 18 nt or 25 nt, and up to at least about 50 contiguous nt in
length or more. In a preferred embodiment, the polynucleotide molecules comprise a
contiguous sequence of at least 12 nt selected from the group consisting of the
polynucleotides shown in SEQ ID NOs:1-3351.

Probes specific to the polynucleotides of the invention can be generated
using the polynucleotide sequences disclosed in SEQ ID NOs:1-3351. The probes are
preferably at least about a 12, 15, 16, 18, 20, 22, 24, or 25 nt fragment of a
corresponding contiguous sequence of SEQ ID NOs:1-3351, and can be less than 2, 1,
0.5, 0.1, or 0.05 kb in length. The probes can be synthesized chemically or can be
generated from longer polynucleotides using restriction enzymes. The probes can be
labeled, for example, with a radioactive, biotinylated, or fluorescent tag. Preferably
probes are designed based upon an identifying sequence of a polynucleotide of one of
SEQ ID NOs:1-3351. More preferably, probes are designed based on a contiguous
sequence of one of the subject polynucleotides that remains unmasked following
application of a masking program for masking low complexity (e.g., XBLAST) to the
sequence, i.e., one would select an unmasked region, as indicated by the
polynucleotides outside the poly-n stretches of the masked sequence produced by the
masking program.
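
The selection of a probe from an unmasked region can be sketched as follows. The sketch assumes that the masking program has replaced low-complexity regions with runs of "N" (the poly-n stretches), and it takes the probe from the middle of the longest unmasked stretch. Both the function names and that selection rule are hypothetical choices for illustration, not the output format or behavior of any particular masking program.

```python
import re


def unmasked_regions(masked: str, min_len: int = 15) -> list[tuple[int, int, str]]:
    """Return (start, end, sequence) for each run of A/C/G/T at least min_len nt long.

    Assumption: masked positions appear as 'N'; coordinates are 0-based, end-exclusive.
    """
    pattern = re.compile(r"[ACGT]{%d,}" % min_len)
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(masked.upper())]


def pick_probe(masked: str, probe_len: int = 15):
    """Pick a probe of probe_len nt from the middle of the longest unmasked region.

    Returns None when no unmasked region is long enough.
    """
    regions = unmasked_regions(masked, probe_len)
    if not regions:
        return None
    _, _, seq = max(regions, key=lambda r: r[1] - r[0])
    offset = (len(seq) - probe_len) // 2
    return seq[offset:offset + probe_len]


if __name__ == "__main__":
    # Hypothetical masked sequence, not drawn from the sequence listing.
    masked = "ACGTACGT" + "N" * 10 + "GATTACAGATTACAGGCCTTAGC" + "NNNN" + "ACG"
    print(unmasked_regions(masked))
    print(pick_probe(masked))
```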

The polynucleotides of the subject invention are isolated and obtained in
substantial purity, generally as other than an intact chromosome. Usually, the
polynucleotides, either as DNA or RNA, will be obtained substantially free of other
naturally-occurring nucleic acid sequences, generally being at least about 50%, usually
at least about 90% pure and are typically "recombinant", e.g., flanked by one or more
nucleotides with which they are not normally associated on a naturally occurring
chromosome.

The polynucleotides of the invention can be provided as a linear
molecule or within a circular molecule, and can be provided within autonomously
replicating molecules (vectors) or within molecules without replication sequences.
Expression of the polynucleotides can be regulated by their own or by other regulatory
sequences known in the art. The polynucleotides of the invention can be introduced
into suitable host cells using a variety of techniques available in the art, such as
transferrin polycation-mediated DNA transfer, transfection with naked or encapsulated
nucleic acids, liposome-mediated DNA transfer, intracellular transportation of DNA-coated
latex beads, protoplast fusion, viral infection, electroporation, gene gun, calcium
phosphate-mediated transfection, and the like.

The subject nucleic acid compositions can be used to, for example,
produce polypeptides, as probes for the detection of mRNA of the invention in
biological samples (e.g., extracts of human cells), to generate additional copies of the
polynucleotides, to generate ribozymes or antisense oligonucleotides, and as single
stranded DNA probes or as triple-strand forming oligonucleotides. The probes
described herein can be used to, for example, determine the presence or absence of the
polynucleotide sequences as shown in SEQ ID NOs:1-3351 or variants thereof in a
sample. These and other uses are described in more detail below.

Use of Polynucleotides to Obtain Full-Length cDNA, Gene, and Promoter Region

Full-length cDNA molecules comprising the disclosed polynucleotides
are obtained as follows. A polynucleotide having a sequence of one of SEQ ID NOs:1-3351,
or a portion thereof comprising at least 12, 15, 18, or 20 nt, is used as a
hybridization probe to detect hybridizing members of a cDNA library using probe
design methods, cloning methods, and clone selection techniques such as those
described in U.S. Patent No. 5,654,173. Libraries of cDNA are made from selected
tissues, such as normal or tumor tissue, or from tissues of a mammal treated with, for
example, a pharmaceutical agent. Preferably, the tissue is the same as the tissue from
which the polynucleotides of the invention were isolated, as both the polynucleotides
described herein and the cDNA represent expressed genes. Most preferably, the cDNA
library is made from the biological material described herein in the Examples. The
choice of cell type for library construction can be made after the identity of the protein
encoded by the gene corresponding to the polynucleotide of the invention is known.
This will indicate which tissue and cell types are likely to express the related gene, and
thus represent a suitable source for the mRNA for generating the cDNA. As described
in the Examples, cDNA of the invention was isolated from specific cell or tissue types,
and such cells and tissues are preferable for obtaining related nucleic acids.

Techniques for producing and probing nucleic acid sequence libraries are
described, for example, in Sambrook et al., Molecular Cloning: A Laboratory Manual,
2nd Ed, (1989) Cold Spring Harbor Press, Cold Spring Harbor, NY. The cDNA can be
prepared by using primers based on sequence from SEQ ID NOs:1-3351. In one
embodiment, the cDNA library can be made from only poly-adenylated mRNA. Thus,
poly-T primers can be used to prepare cDNA from the mRNA.

Members of the library that are larger than the provided polynucleotides,
and preferably that encompass the complete coding sequence of the native message, are
obtained. In order to confirm that the entire cDNA has been obtained, RNA protection
experiments are performed as follows. Hybridization of a full-length cDNA to an
mRNA will protect the RNA from RNase degradation. If the cDNA is not full length,
then the portions of the mRNA that are not hybridized will be subject to RNase
degradation. This is assayed, as is known in the art, by changes in electrophoretic
mobility on polyacrylamide gels, or by detection of released monoribonucleotides.
Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed, (1989) Cold
Spring Harbor Press, Cold Spring Harbor, NY. In order to obtain additional sequences
5' to the end of a partial cDNA, 5' RACE (PCR Protocols: A Guide to Methods and
Applications, (1990) Academic Press, Inc.) can be performed.

Genomic DNA is isolated using the provided polynucleotides in a
manner similar to the isolation of full-length cDNAs. Briefly, the provided
polynucleotides, or portions thereof, are used as probes to libraries of genomic DNA.
Preferably, the library is obtained from the cell type that was used to generate the
polynucleotides of the invention, but this is not essential. Most preferably, the genomic
DNA is obtained from the biological material described herein in the Examples. Such
libraries can be in vectors suitable for carrying large segments of a genome, such as P1
or YAC, as described in detail in Sambrook et al., 9.4-9.30. In addition, genomic
sequences can be isolated from human BAC libraries, which are commercially available
from Research Genetics, Inc., Huntsville, Alabama, USA, for example. In order to
obtain additional 5' or 3' sequences, chromosome walking is performed, as described in
Sambrook et al., such that adjacent and overlapping fragments of genomic DNA are
isolated. These are mapped and pieced together, as is known in the art, using restriction
digestion enzymes and DNA ligase.

Using the polynucleotide sequences of the invention, corresponding full-length
genes can be isolated using both classical and PCR methods to construct and
probe cDNA libraries. Using either method, Northern blots, preferably, are performed
on a number of cell types to determine which cell lines express the gene of interest at
the highest level. Classical methods of constructing cDNA libraries are taught in
Sambrook et al., supra. With these methods, cDNA can be produced from mRNA and
inserted into viral or expression vectors. Typically, libraries of mRNA comprising
poly(A) tails can be produced with poly(T) primers. Similarly, cDNA libraries can be
produced using the instant sequences as primers.

PCR methods are used to amplify the members of a cDNA library that
comprise the desired insert. In this case, the desired insert will contain sequence from
the full length cDNA that corresponds to the instant polynucleotides. Such PCR
methods include gene trapping and RACE methods as described in Gruber et al., WO
95/04745 and Gruber et al., U.S. Patent No. 5,500,356. Kits are commercially available
to perform gene trapping experiments from, for example, Life Technologies,
Gaithersburg, Maryland, USA. In preferred embodiments of RACE, a common primer
is designed to anneal to an arbitrary adaptor sequence ligated to cDNA ends (Apte and
Siebert, Biotechniques (1993) 15:890-893; Edwards et al., Nuc. Acids Res. (1991)
19:5227-5232). When a single gene-specific RACE primer is paired with the common
primer, preferential amplification of sequences between the single gene specific primer
and the common primer occurs. Commercial cDNA pools modified for use in RACE
are available.

The promoter region of a gene generally is located 5' to the initiation site
for RNA polymerase II. Hundreds of promoter regions contain the "TATA" box, a
sequence such as TATTA or TATAA, which is sensitive to mutations. The promoter
region can be obtained by performing 5' RACE using a primer from the coding region
of the gene. Alternatively, the cDNA can be used as a probe for the genomic sequence,
and the region 5' to the coding region is identified by "walking up." If the gene is
highly expressed or differentially expressed, the promoter from the gene can be of use
in a regulatory construct for a heterologous gene.
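
As a simple illustration, a 5' flanking sequence can be scanned for the two TATA-like motifs named above (TATAA and TATTA). The function name and the example sequence are hypothetical, and an exact-match scan of this kind is only a first-pass screen, not a promoter prediction method.

```python
import re

# Matches TATAA or TATTA; the lookahead lets overlapping occurrences be reported.
TATA_LIKE = re.compile(r"(?=(TAT[AT]A))")


def find_tata_like(upstream: str) -> list[tuple[int, str]]:
    """Return (0-based position, motif) for each TATAA/TATTA occurrence."""
    return [(m.start(), m.group(1)) for m in TATA_LIKE.finditer(upstream.upper())]


if __name__ == "__main__":
    # Hypothetical 5' flanking sequence.
    print(find_tata_like("GGCTATAAAGGCCTATTAGG"))
```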

Once the full-length cDNA or gene is obtained, DNA encoding variants
can be prepared by site-directed mutagenesis, described in detail in Sambrook et al.,
15.3-15.63. The choice of codon or nucleotide to be replaced can be based on disclosure
herein on optional changes in amino acids to achieve altered protein structure and/or
function.

As an alternative method to obtaining DNA or RNA from a biological
material, nucleic acid comprising nucleotides having the sequence of one or more
polynucleotides of the invention can be synthesized. Thus, the invention encompasses
nucleic acid molecules ranging in length from 15 nt (corresponding to at least 15
contiguous nt of one of SEQ ID NOs:1-3351) up to a maximum length suitable for one
or more biological manipulations, including replication and expression, of the nucleic
acid molecule. The invention includes but is not limited to (a) nucleic acid having the
size of a full gene, and comprising at least one of SEQ ID NOs:1-3351; (b) the nucleic
acid of (a) also comprising at least one additional polynucleotide or gene, operably
linked to permit expression of a fusion protein; (c) an expression vector comprising (a)
or (b); (d) a plasmid comprising (a) or (b); and (e) a recombinant viral particle
comprising (a) or (b). Once provided with the polynucleotides disclosed herein,
construction or preparation of (a)-(e) is well within the skill in the art.

The sequence of a nucleic acid comprising at least 15 contiguous nt of at
least any one of SEQ ID NOs:1-3351, preferably the entire sequence of at least any one
of SEQ ID NOs:1-3351, is not limited and can be any sequence of A, T, G, and/or C
(for DNA) and A, U, G, and/or C (for RNA) or modified bases thereof, including
inosine and pseudouridine. The choice of sequence will depend on the desired function
and can be dictated by coding regions desired, the intron-like regions desired, and the
regulatory regions desired. Where the entire sequence of any one of SEQ ID NOs:1-3351
is within the nucleic acid, the nucleic acid obtained is referred to herein as a
polynucleotide comprising the sequence of any one of SEQ ID NOs:1-3351.

Expression of Polypeptide Encoded by Full-Length cDNA or Full-Length Gene

The provided polynucleotides (e.g., a polynucleotide having a sequence
of one of SEQ ID NOs:1-3351), the corresponding cDNA, or the full-length gene is
used to express a partial or complete gene product. Constructs of polynucleotides
having sequences of SEQ ID NOs:1-3351 can be generated synthetically. Alternatively,
single-step assembly of a gene and entire plasmid from large numbers of
oligodeoxyribonucleotides is described by, e.g., Stemmer et al., Gene (Amsterdam)
(1995) 164(1):49-53. In this method, assembly PCR (the synthesis of long DNA
sequences from large numbers of oligodeoxyribonucleotides (oligos)) is described. The
method is derived from DNA shuffling (Stemmer, Nature (1994) 370:389-391), and
does not rely on DNA ligase, but instead relies on DNA polymerase to build
increasingly longer DNA fragments during the assembly process.
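
The idea of building a long sequence from overlapping oligos can be sketched in silico as follows. The oligo length (40 nt), the overlap (20 nt), the alternating-strand layout, and the function names are assumptions chosen for illustration and are not parameters stated in this description. The `assemble` function only checks that the designed oligos tile the target; it does not simulate polymerase extension.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_assembly_oligos(target: str, oligo_len: int = 40, overlap: int = 20) -> list[str]:
    """Split `target` into overlapping oligos on alternating strands.

    Even-numbered oligos are top strand; odd-numbered ones are reverse complements,
    so each oligo can anneal to its neighbours through the shared overlap.
    """
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be positive and shorter than oligo_len")
    target = target.upper()
    step = oligo_len - overlap
    oligos = []
    for k, start in enumerate(range(0, max(len(target) - overlap, 1), step)):
        piece = target[start:start + oligo_len]
        oligos.append(piece if k % 2 == 0 else reverse_complement(piece))
    return oligos


def assemble(oligos: list[str], overlap: int = 20) -> str:
    """Rebuild the top strand from the oligos, checking every overlap."""
    top = [o if k % 2 == 0 else reverse_complement(o) for k, o in enumerate(oligos)]
    seq = top[0]
    for piece in top[1:]:
        if seq[-overlap:] != piece[:overlap]:
            raise ValueError("adjacent oligos do not share the expected overlap")
        seq += piece[overlap:]
    return seq


if __name__ == "__main__":
    # Hypothetical 90 nt target, not drawn from the sequence listing.
    target = ("ACGTTGCAAGGCTTAACCGGATTGCCATGGCAATCGGATC" * 3)[:90]
    oligos = design_assembly_oligos(target)
    print(len(oligos), assemble(oligos) == target)
```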

Appropriate polynucleotide constructs are purified using standard
recombinant DNA techniques as described in, for example, Sambrook et al., Molecular
Cloning: A Laboratory Manual, 2nd Ed, (1989) Cold Spring Harbor Press, Cold Spring
Harbor, NY, and under current regulations described in United States Dept. of HHS,
National Institutes of Health (NIH) Guidelines for Recombinant DNA Research. The
gene product encoded by a polynucleotide of the invention is expressed in any
expression system, including, for example, bacterial, yeast, insect, amphibian and
mammalian systems. Vectors, host cells and methods for obtaining expression in same
are well known in the art. Suitable vectors and host cells are described in U.S. Patent
No. 5,654,173.

Polynucleotide molecules comprising a polynucleotide sequence
provided herein are generally propagated by placing the molecule in a vector. Viral and
non-viral vectors are used, including plasmids. The choice of plasmid will depend on
the type of cell in which propagation is desired and the purpose of propagation. Certain
vectors are useful for amplifying and making large amounts of the desired DNA
sequence. Other vectors are suitable for expression in cells in culture. Still other
vectors are suitable for transfer and expression in cells in a whole animal or person. The
choice of appropriate vector is well within the skill of the art. Many such vectors are
available commercially. Methods for preparation of vectors comprising a desired
sequence are well known in the art.

The polynucleotides set forth in SEQ ID NOs: 1-3351 or their
corresponding full-length polynucleotides are linked to regulatory sequences as
appropriate to obtain the desired expression properties. These can include promoters
(attached either at the 5' end of the sense strand or at the 3' end of the antisense strand),
enhancers, terminators, operators, repressors, and inducers. The promoters can be
regulated or constitutive. In some situations it may be desirable to use conditionally
active promoters, such as tissue-specific or developmental stage-specific promoters.
These are linked to the desired nucleotide sequence using the techniques described
above for linkage to vectors. Any techniques known in the art can be used.

When any appropriate host cells or organisms are used to replicate
and/or express the polynucleotides or nucleic acids of the invention, the resulting
replicated nucleic acid, RNA, expressed protein or polypeptide is within the scope of
the invention as a product of the host cell or organism. The product is recovered by any
appropriate means known in the art.

Once the gene corresponding to a selected polynucleotide is identified,
its expression can be regulated in the cell to which the gene is native. For example, an
endogenous gene of a cell can be regulated by an exogenous regulatory sequence as
disclosed in U.S. Patent No. 5,641,670.

Identification of Functional and Structural Motifs of Novel Genes 

Translations of the nucleotide sequence of the provided polynucleotides, 
cDNAs or full genes can be aligned with individual known sequences. Similarity with 
individual sequences can be used to determine the activity of the polypeptides encoded 
by the polynucleotides of the invention. Also, sequences exhibiting similarity with 
more than one individual sequence can exhibit activities that are characteristic of either 
or both individual sequences. 

The full length sequences and fragments of the polynucleotide sequences 
of the nearest neighbors can be used as probes and primers to identify and isolate the 
full length sequence corresponding to provided polynucleotides. The nearest neighbors 
can indicate a tissue or cell type to be used to construct a library for the full-length 
sequences corresponding to the provided polynucleotides. 

Typically, a selected polynucleotide is translated in all six frames to
determine the best alignment with the individual sequences. The sequences disclosed
herein in the Sequence Listing are in a 5' to 3' orientation and translation in three
frames can be sufficient. These amino acid sequences are referred to, generally, as
query sequences, which will be aligned with the individual sequences. Databases with
individual sequences are described in "Computer Methods for Macromolecular
Sequence Analysis," Methods in Enzymology (1996) 266, Doolittle, Academic Press,
Inc., a division of Harcourt Brace & Co., San Diego, California, USA. Databases
include Genbank, EMBL, and DNA Database of Japan (DDBJ).
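The six-frame translation step can be illustrated with a minimal Python sketch. It assumes the standard genetic code and plain A/C/G/T input; the function names (`translate`, `six_frame_translations`) are hypothetical and are not taken from any program cited in this description.

```python
# Standard genetic code in the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate a DNA string codon by codon; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Return translations for frames +1..+3 and -1..-3 (hypothetical labels)."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames
```

Because the listed sequences are given 5' to 3', a caller could restrict attention to the three "+" frames where that is sufficient.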

Query and individual sequences can be aligned using the methods and
computer programs described above, and include BLAST, available over the world
wide web at http://www.ncbi.nlm.nih.gov/BLAST. Another alignment algorithm is
Fasta, available in the Genetics Computer Group (GCG) package, Madison,
Wisconsin, USA, a wholly owned subsidiary of Oxford Molecular Group, Inc. Other
techniques for alignment are described in Doolittle, supra. Preferably, an alignment
program that permits gaps in the sequence is utilized to align the sequences. The
Smith-Waterman algorithm is one type of algorithm that permits gaps in sequence alignments.
See Meth. Mol. Biol. (1997) 70:173-187. Also, the GAP program using the Needleman
and Wunsch alignment method can be utilized to align sequences. An alternative search
strategy uses MPSRCH software, which runs on a MASPAR computer. MPSRCH uses
a Smith-Waterman algorithm to score sequences on a massively parallel computer.
This approach improves the ability to identify distantly related matches,
and is especially tolerant of small gaps and nucleotide sequence errors. Amino acid
sequences encoded by the provided polynucleotides can be used to search both protein
and DNA databases.
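As an illustration of a gap-permitting local alignment, the following is a minimal sketch of Smith-Waterman scoring in Python. The scoring values (match 2, mismatch -1, linear gap penalty -2) are assumptions chosen for the example; the programs named above use substitution matrices and their own gap penalties, which are not reproduced here.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between strings a and b.

    Uses a linear gap penalty and keeps only two rows of the scoring
    matrix, so it reports the score but not the alignment itself.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero: that is what makes the
            # alignment local rather than global.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

With these assumed scores, aligning "ACGTT" against "ACTT" is best served by opening one gap opposite the G, which is the kind of tolerance to small gaps the text describes.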

High Similarity. In general, in alignment results considered to be of high
similarity, the percent of the alignment region length is typically at least about 55% of
the total length of the query sequence; more typically, at least about 58%; even more typically, at
least about 60% of the total residue length of the query sequence. Usually, the percent
length of the alignment region can be as much as about 62%; more usually, as much as
about 64%; even more usually, as much as about 66%. Further, for high similarity, the
region of alignment typically exhibits at least about 75% sequence identity; more
typically, at least about 78%; even more typically, at least about 80% sequence identity.
Usually, percent sequence identity can be as much as about 82%; more usually, as much
as about 84%; even more usually, as much as about 86%.

The p value is used in conjunction with these methods. If high similarity
is found, the query sequence is considered to have high similarity with a profile
sequence when the p value is less than or equal to about 10⁻²; more usually, less than or
equal to about 10⁻³; even more usually, less than or equal to about 10⁻⁴. More typically,
the p value is no more than about 10⁻⁵; more typically, no more than about
10⁻¹⁰; even more typically, no more than about 10⁻¹⁵ for the query sequence
to be considered high similarity.
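The three criteria for high similarity (alignment length as a share of the query, sequence identity in the aligned region, and p value) can be combined in a small check. This is a sketch only: the function name and parameters are hypothetical, the defaults are the least stringent figures given above (about 55%, about 75%, and a p value bound read as 10^-2), and a stricter tier can be passed in by the caller.

```python
def is_high_similarity(aligned_length, query_length, percent_identity, p_value,
                       min_coverage=55.0, min_identity=75.0, max_p=1e-2):
    """Apply the three high-similarity criteria to one alignment result.

    aligned_length and query_length are residue counts; percent_identity
    is measured over the aligned region only.
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    coverage = 100.0 * aligned_length / query_length
    return (coverage >= min_coverage
            and percent_identity >= min_identity
            and p_value <= max_p)
```

For example, `is_high_similarity(60, 100, 80.0, 1e-5)` passes all three default thresholds, while the same alignment with a p value of 0.5 does not.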

Similarity Determined by Sequence Identity Alone. Sequence identity
alone can be used to determine similarity of a query sequence to an individual sequence
and can indicate the activity of the sequence. Such an alignment, preferably, permits
gaps to align sequences. Typically, the query sequence is related to the profile sequence
if the sequence identity over the entire query sequence is at least about 15%; more
typically, at least about 20%; even more typically, at least about 25%; even more
typically, at least about 50%. Sequence identity alone as a measure of similarity is most
useful when the query sequence is usually at least 80 residues in length; more usually,
at least 90 residues; even more usually, at least 95 amino acid residues in length. More
typically, similarity can be concluded based on sequence identity alone when the query
sequence is preferably 100 residues in length; more preferably, 120 residues in length;
even more preferably, 150 amino acid residues in length.
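Sequence identity over the entire query can be computed from a gapped pairwise alignment as sketched below. The sketch assumes the two aligned strings have equal length and use "-" for gaps; the function name is hypothetical. The denominator is the ungapped query length, so unaligned or gapped query residues lower the result.

```python
def percent_identity_over_query(aligned_query, aligned_subject):
    """Percent of query residues that are identical to the aligned subject.

    Both inputs are rows of one pairwise alignment, with '-' marking gaps.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    query_length = sum(1 for residue in aligned_query if residue != "-")
    if query_length == 0:
        raise ValueError("query contains no residues")
    matches = sum(
        1
        for q, s in zip(aligned_query, aligned_subject)
        if q != "-" and q == s
    )
    return 100.0 * matches / query_length
```

A caller applying the guidance above would also check that the query is long enough (for example, at least 80 residues) before relying on identity alone.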

Alignments with Profile and Multiple Aligned Sequences. Translations
of the provided polynucleotides can be aligned with amino acid profiles that define
either protein families or common motifs. Also, translations of the provided
polynucleotides can be aligned to multiple sequence alignments (MSA) comprising the
polypeptide sequences of members of protein families or motifs. Similarity or identity
with profile sequences or MSAs can be used to determine the activity of the gene
products (e.g., polypeptides) encoded by the provided polynucleotides or corresponding
cDNA or genes. For example, sequences that show an identity or similarity with a
chemokine profile or MSA can exhibit chemokine activities.

Profiles can be designed manually by (1) creating an MSA, which is an
alignment of the amino acid sequences of members that belong to the family and (2)
constructing a statistical representation of the alignment. Such methods are described,
for example, in Birney et al., Nucl. Acid Res. (1996) 24(14):2730-2739. MSAs of some
protein families and motifs are publicly available. MSAs are described also in
Sonnhammer et al., Proteins (1997) 28:405-420. A brief description of MSAs is
reported in Pascarella et al., Prot. Eng. (1996) 9(5):249-251. Techniques for building
profiles from MSAs are described in Sonnhammer et al., supra; Birney et al., supra;
and "Computer Methods for Macromolecular Sequence Analysis," Methods in
Enzymology (1996) 266, Doolittle, Academic Press, Inc., San Diego, California, USA.
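One very simple statistical representation of an MSA is a table of per-column residue frequencies. The sketch below builds such a table in Python; it is an illustration under stated assumptions (equal-length rows, "-" for gaps, no sequence weighting or pseudocounts) and is not the profile construction used by the cited references.

```python
from collections import Counter


def build_profile(msa):
    """Return one {residue: frequency} dict per alignment column.

    msa is a list of equal-length aligned sequences; gap characters
    are ignored when computing the frequencies for a column.
    """
    if not msa or len({len(row) for row in msa}) != 1:
        raise ValueError("msa must be a non-empty list of equal-length rows")
    profile = []
    for column in zip(*msa):
        residues = [residue for residue in column if residue != "-"]
        counts = Counter(residues)
        total = len(residues)
        profile.append(
            {residue: count / total for residue, count in counts.items()}
            if total else {}
        )
    return profile
```

For the toy alignment `["KAC", "RAC", "KGC"]`, the third column maps to `{"C": 1.0}`, while the first column records that K occurs in two of three members.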

Similarity between a query sequence and a protein family or motif can be
determined by (a) comparing the query sequence against the profile and/or (b) aligning
the query sequence with the members of the family or motif. Typically, a program such
as Searchwise is used to compare the query sequence to the statistical representation of
the multiple alignment, also known as a profile (see Birney et al., supra). Other
techniques to compare the sequence and profile are described in Sonnhammer et al.,
supra and Doolittle, supra.

Next, methods described by Feng et al., J. Mol. Evol. (1987) 25:351 and
Higgins et al., CABIOS (1989) 5:151 can be used to align the query sequence with the
members of a family or motif, also known as an MSA. Sequence alignments can be
generated using any of a variety of software tools. Examples include PileUp, which
creates a multiple sequence alignment, and is described in Feng et al., J. Mol. Evol.
(1987) 25:351. Another method, GAP, uses the alignment method of Needleman et al.,
J. Mol. Biol. (1970) 48:443. GAP is best suited for global alignment of sequences. A
third method, BestFit, functions by inserting gaps to maximize the number of matches
using the local homology algorithm of Smith et al., Adv. Appl. Math. (1981) 2:482. In
general, the following factors are used to determine if a similarity between a query
sequence and a profile or MSA exists: (1) number of conserved residues found in the
query sequence, (2) percentage of conserved residues found in the query sequence, (3)
number of frameshifts, and (4) spacing between conserved residues.
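Factors (1) and (2) can be computed once conserved positions have been identified in the MSA. The sketch below is an assumption-laden illustration: it treats a column as conserved when a single residue occurs in at least a chosen fraction of members (0.4 by default, matching the "at least about 40%" figure used in this description), ignores residue classes, and assumes the query has already been aligned to the MSA columns. All names are hypothetical.

```python
from collections import Counter


def conserved_positions(msa, min_fraction=0.4):
    """Map column index -> residue for columns dominated by one residue."""
    conserved = {}
    for index, column in enumerate(zip(*msa)):
        residue, count = Counter(column).most_common(1)[0]
        if residue != "-" and count / len(column) >= min_fraction:
            conserved[index] = residue
    return conserved


def conserved_residue_stats(aligned_query, msa, min_fraction=0.4):
    """Return (number, percentage) of conserved residues found in the query."""
    conserved = conserved_positions(msa, min_fraction)
    hits = sum(
        1
        for index, residue in conserved.items()
        if index < len(aligned_query) and aligned_query[index] == residue
    )
    percent = 100.0 * hits / len(conserved) if conserved else 0.0
    return hits, percent
```

The returned percentage can then be compared with whatever cutoff the analysis uses for declaring similarity to the profile or MSA.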

Some alignment programs that both translate and align sequences can
make any number of frameshifts when translating the nucleotide sequence to produce
the best alignment. The fewer frameshifts needed to produce an alignment, the stronger
the similarity or identity between the query and profile or MSAs. For example, a weak
similarity resulting from no frameshifts can be a better indication of activity or structure
of a query sequence than a strong similarity resulting from two frameshifts. Preferably,
three or fewer frameshifts are found in an alignment; more preferably, two or fewer
frameshifts; even more preferably, one or fewer frameshifts; even more preferably, no
frameshifts are found in an alignment of query and profile or MSAs.

Conserved residues are those amino acids found at a particular position
in all or some of the family or motif members. Alternatively, a position is considered
conserved if only a certain class of amino acids is found in a particular position in all or
some of the family members. For example, the N-terminal position can contain a
positively charged amino acid, such as lysine, arginine, or histidine.

Typically, a residue of a polypeptide is conserved when a class of amino
acids or a single amino acid is found at a particular position in at least about 40% of all
class members; more typically, at least about 50%; even more typically, at least about
60% of the members. Usually, a residue is conserved when a class or single amino acid
is found in at least about 70% of the members of a family or motif; more usually, at
least about 80%; even more usually, at least about 90%; even more usually, at least
about 95%.

A residue is considered conserved when three unrelated amino acids are
found at a particular position in some or all of the members; more usually, two
unrelated amino acids. These residues are conserved when the unrelated amino acids
are found at particular positions in at least about 40% of all class members; more
typically, at least about 50%; even more typically, at least about 60% of the members.
Usually, a residue is conserved when a class or single amino acid is found in at least
about 70% of the members of a family or motif; more usually, at least about 80%; even
more usually, at least about 90%; even more usually, at least about 95%.

A query sequence has similarity to a profile or MSA when the query
sequence comprises at least about 25% of the conserved residues of the profile or MSA;
more usually, at least about 30%; even more usually, at least about 40%. Typically, the
query sequence has a stronger similarity to a profile sequence or MSA when the query
sequence comprises at least about 45% of the conserved residues of the profile or MSA;
more typically, at least about 50%; even more typically, at least about 55%.

Identification of Secreted and Membrane-Bound Polypeptides

Both secreted and membrane-bound polypeptides of the present
invention are of particular interest. For example, levels of secreted polypeptides can be
assayed in body fluids that are convenient, such as blood, plasma, serum, and other
body fluids such as urine, prostatic fluid and semen. Membrane-bound polypeptides are
useful for constructing vaccine antigens or inducing an immune response. Such
antigens would comprise all or part of the extracellular region of the membrane-bound
polypeptides. Because both secreted and membrane-bound polypeptides comprise a
fragment of contiguous hydrophobic amino acids, hydrophobicity predicting algorithms
can be used to identify such polypeptides.

A signal sequence is usually encoded by both secreted and membrane-bound
polypeptide genes to direct a polypeptide to the surface of the cell. The signal
sequence usually comprises a stretch of hydrophobic residues. Such signal sequences
can fold into helical structures. Membrane-bound polypeptides typically comprise at
least one transmembrane region that possesses a stretch of hydrophobic amino acids that
can traverse the membrane. Some transmembrane regions also exhibit a helical
structure. Hydrophobic fragments within a polypeptide can be identified by using
computer algorithms. Such algorithms include Hopp & Woods, Proc. Natl. Acad. Sci.
USA (1981) 78:3824-3828; Kyte & Doolittle, J. Mol. Biol. (1982) 157:105-132; and
RAOAR algorithm, Degli Esposti et al., Eur. J. Biochem. (1990) 190:207-219.

Another method of identifying secreted and membrane-bound
polypeptides is to translate the polynucleotides of the invention in all six frames and
determine if at least 8 contiguous hydrophobic amino acids are present. Those
translated polypeptides with at least 8; more typically, 10; even more typically, 12
contiguous hydrophobic amino acids are considered to be either a putative secreted or
membrane-bound polypeptide. Hydrophobic amino acids include alanine, glycine,
histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, threonine,
tryptophan, tyrosine, and valine.
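The scan for contiguous hydrophobic residues can be sketched as follows. The residue set is taken directly from the list in the preceding paragraph, written in one-letter codes; the function name and the returned tuple layout are hypothetical. The input is assumed to be an already translated frame, so a caller would apply this to each of the six translations in turn.

```python
# One-letter codes for the hydrophobic amino acids listed in the text:
# Ala, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Thr, Trp, Tyr, Val.
HYDROPHOBIC = set("AGHILKMFPTWYV")


def hydrophobic_runs(protein, min_length=8):
    """Return (start_index, run) for each run of hydrophobic residues
    that is at least min_length residues long."""
    protein = protein.upper()
    runs = []
    start = None
    # The trailing '*' is a sentinel so a run at the very end is closed.
    for index, residue in enumerate(protein + "*"):
        if residue in HYDROPHOBIC:
            if start is None:
                start = index
        else:
            if start is not None and index - start >= min_length:
                runs.append((start, protein[start:index]))
            start = None
    return runs
```

Raising `min_length` to 10 or 12 applies the more stringent cutoffs mentioned above; a translated frame with at least one reported run would be flagged as a putative secreted or membrane-bound polypeptide.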

Identification of the Function of an Expression Product of a Full-Length Gene

Ribozymes, antisense constructs, and dominant negative mutants can be
used to determine function of the expression product of a gene corresponding to a
polynucleotide provided herein. The phosphoramidite method of oligonucleotide
synthesis can be used to construct antisense molecules and ribozymes. See Beaucage et
al., Tet. Lett. (1981) 22:1859 and U.S. Patent No. 4,668,777. Automated devices for
synthesis are available to create oligonucleotides using this chemistry. Examples of
such devices include Biosearch 8600, Models 392 and 394 by Applied Biosystems, a
division of Perkin-Elmer Corp., Foster City, California, USA; and Expedite by
Perceptive Biosystems, Framingham, Massachusetts, USA. Synthetic RNA, phosphate
analog oligonucleotides, and chemically derivatized oligonucleotides can also be
produced, and can be covalently attached to other molecules. RNA oligonucleotides
can be synthesized, for example, using RNA phosphoramidites. This method can be
performed on an automated synthesizer, such as Applied Biosystems, Models 392 and
394, Foster City, California, USA.
Oligonucleotides of up to 200 nt can be synthesized, more typically, 100
nt, more typically 50 nt; even more typically 30 to 40 nt. These synthetic fragments can
be annealed and ligated together to construct larger fragments. See, for example,
Sambrook et al., supra. Trans-cleaving catalytic RNAs (ribozymes) are RNA
molecules possessing endoribonuclease activity. Ribozymes are specifically designed
for a particular target, and the target message must contain a specific nucleotide
sequence. They are engineered to cleave any RNA species site-specifically in the
background of cellular RNA. The cleavage event renders the mRNA unstable and
prevents protein expression. Importantly, ribozymes can be used to inhibit expression
of a gene of unknown function for the purpose of determining its function in an in vitro
or in vivo context, by detecting the phenotypic effect.

Antisense nucleic acids are designed to specifically bind to RNA,
resulting in the formation of RNA-DNA or RNA-RNA hybrids, with an arrest of DNA
replication, reverse transcription or messenger RNA translation. Antisense
polynucleotides based on a selected polynucleotide sequence can interfere with
expression of the corresponding gene. Antisense polynucleotides are typically
generated within the cell by expression from antisense constructs that contain the
antisense strand as the transcribed strand. Antisense polynucleotides based on the
disclosed polynucleotides will bind and/or interfere with the translation of mRNA
comprising a sequence complementary to the antisense polynucleotide. The expression
products of control cells and cells treated with the antisense construct are compared to
detect the protein product of the gene corresponding to the polynucleotide upon which
the antisense construct is based. The protein is isolated and identified using routine
biochemical methods.
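Because an antisense polynucleotide is complementary to its target mRNA, a candidate antisense sequence is simply the reverse complement of the chosen target region. The sketch below illustrates that relationship only; the function name and the `as_dna` switch are hypothetical, and practical antisense design involves further considerations (target accessibility, length, chemistry) that are not modeled here.

```python
def antisense_oligo(target, as_dna=True):
    """Return the reverse complement of a target RNA or DNA region.

    The result is written 5' to 3'. With as_dna=True the oligo uses T,
    otherwise U. Bases other than A, C, G, T, U raise a KeyError.
    """
    pairs = {
        "A": "T" if as_dna else "U",
        "U": "A",
        "T": "A",
        "G": "C",
        "C": "G",
    }
    return "".join(pairs[base] for base in reversed(target.upper()))
```

For the toy target `AUGGCC`, this yields `GGCCAT` as a DNA oligo or `GGCCAU` as an RNA oligo.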

Given the extensive background literature and clinical experience in
antisense therapy, one skilled in the art can use selected polynucleotides of the
invention as additional potential therapeutics. The choice of polynucleotide can be
narrowed by first testing them for binding to "hot spot" regions of the genome of
cancerous cells. If a polynucleotide is identified as binding to a "hot spot," testing the
polynucleotide as an antisense compound in the corresponding cancer cells is
warranted.

Dominant negative mutations also are readily generated for
corresponding proteins that are active as homomultimers. A mutant polypeptide will
interact with wild-type polypeptides (made from the other allele) and form a
non-functional multimer. Thus, a mutation is in a substrate-binding domain, a catalytic
domain or a cellular localization domain. Preferably, the mutant polypeptide will be
overproduced. Point mutations are made that have such an effect. In addition, fusion of
different polypeptides of various lengths to the terminus of a protein can yield dominant
negative mutants. General strategies are available for making dominant negative
mutants (see, e.g., Herskowitz, Nature (1987) 329:219). Such techniques can be used to
create loss of function mutations, which are useful for determining protein function.



Polypeptides and Variants Thereof

The polypeptides of the invention include those encoded by the disclosed
polynucleotides, as well as by nucleic acids that, by virtue of the degeneracy of the genetic
code, are not identical in sequence to the disclosed polynucleotides. Thus, the invention
includes within its scope a polypeptide encoded by a polynucleotide having the
sequence of any one of SEQ ID NOs: 1-3351 or a variant thereof.

In general, the term "polypeptide" as used herein refers to the full
length polypeptide encoded by the recited polynucleotide, the polypeptide encoded by
the gene represented by the recited polynucleotide, as well as portions or fragments
thereof. "Polypeptides" also includes variants of the naturally occurring proteins, where
such variants are homologous or substantially similar to the naturally occurring protein,
and can be of an origin of the same or different species as the naturally occurring
protein (e.g., human, murine, or some other species that naturally expresses the recited
polypeptide, usually a mammalian species). In general, variant polypeptides have a
sequence that has at least about 80%, usually at least about 90%, and more usually at
least about 98% sequence identity with a differentially expressed polypeptide of the
invention, as measured by BLAST using the parameters described above. The variant
polypeptides can be naturally or non-naturally glycosylated, i.e., the polypeptide has a
glycosylation pattern that differs from the glycosylation pattern found in the
corresponding naturally occurring protein.

The invention also encompasses homologs of the disclosed polypeptides
(or fragments thereof) where the homologs are isolated from other species, i.e., other
animal or plant species, usually mammalian species, e.g.,
rodents, such as mice, rats; domestic animals, e.g., horse, cow, dog, cat; and humans.
By "homolog" is meant a polypeptide having at least about 35%, usually at least about
40% and more usually at least about 60% amino acid sequence identity to a particular
differentially expressed protein as identified above, where sequence identity is
determined using the BLAST algorithm, with the parameters described above.

In general, the polypeptides of the subject invention are provided in a
non-naturally occurring environment, e.g., are separated from their naturally occurring
environment. In certain embodiments, the subject protein is present in a composition
that is enriched for the protein as compared to a control. As such, purified polypeptide
is provided, where by purified is meant that the protein is present in a composition that
is substantially free of non-differentially expressed polypeptides, where by substantially
free is meant that less than 90%, usually less than 60% and more usually less than 50%
of the composition is made up of non-differentially expressed polypeptides.



Also within the scope of the invention are variants; variants of
polypeptides include mutants, fragments, and fusions. Mutants can include amino acid
substitutions, additions or deletions. The amino acid substitutions can be conservative
amino acid substitutions or substitutions to eliminate non-essential amino acids, such as
to alter a glycosylation site, a phosphorylation site or an acetylation site, or to minimize
misfolding by substitution or deletion of one or more cysteine residues that are not
necessary for function. Conservative amino acid substitutions are those that preserve
the general charge, hydrophobicity/hydrophilicity, and/or steric bulk of the amino acid
substituted. Variants can be designed so as to retain biological activity of a particular
region of the protein (e.g., a functional domain and/or, where the polypeptide is a
member of a protein family, a region associated with a consensus sequence). Selection
of amino acid alterations for production of variants can be based upon the accessibility
(interior vs. exterior) of the amino acid (see, e.g., Go et al., Int. J. Peptide Protein Res.
(1980) 15:211), the thermostability of the variant polypeptide (see, e.g., Querol et al.,
Prot. Eng. (1996) 9:265), desired glycosylation sites (see, e.g., Olsen and Thomsen, J.
Gen. Microbiol. (1991) 137:579), desired disulfide bridges (see, e.g., Clarke et al.,
Biochemistry (1993) 32:4322; and Wakarchuk et al., Protein Eng. (1994) 7:1379),
desired metal binding sites (see, e.g., Toma et al., Biochemistry (1991) 30:97, and
Haezerbrouck et al., Protein Eng. (1993) 6:643), and desired substitutions within
proline loops (see, e.g., Masul et al., Appl. Env. Microbiol. (1994) 60:3579).
Cysteine-depleted muteins can be produced as disclosed in U.S. Patent No. 4,959,314.

Variants also include fragments of the polypeptides disclosed herein,
particularly biologically active fragments and/or fragments corresponding to functional
domains. Fragments of interest will typically be at least about 10 aa to at least about 15
aa in length, usually at least about 50 aa in length, and can be as long as 300 aa in length
or longer, but will usually not exceed about 1000 aa in length, where the fragment will
have a stretch of amino acids that is identical to a polypeptide encoded by a
polynucleotide having a sequence of any of SEQ ID NOs: 1-3351, or a homolog thereof.
The protein variants described herein are encoded by polynucleotides that are within the
scope of the invention. The genetic code can be used to select the appropriate codons to
construct the corresponding variants.

Computer-Related Embodiments

In general, a library of polynucleotides is a collection of sequence
information, which information is provided in either biochemical form (e.g., as a
collection of polynucleotide molecules), or in electronic form (e.g., as a collection of
polynucleotide sequences stored in a computer-readable form, as in a computer system
and/or as part of a computer program). The sequence information of the
polynucleotides can be used in a variety of ways, e.g., as a resource for gene discovery,
as a representation of sequences expressed in a selected cell type (e.g., cell type
markers), and/or as markers of a given disease or disease state. In general, a disease
marker is a representation of a gene product that is present in all cells affected by
disease either at an increased or decreased level relative to a normal cell (e.g., a cell of
the same or similar type that is not substantially affected by disease). For example, a
polynucleotide sequence in a library can be a polynucleotide that represents an mRNA,
polypeptide, or other gene product encoded by the polynucleotide, that is either
overexpressed or underexpressed in a breast ductal cell affected by cancer relative to a
normal (i.e., substantially disease-free) breast cell.

The nucleotide sequence information of the library can be embodied in
any suitable form, e.g., electronic or biochemical forms. For example, a library of
sequence information embodied in electronic form comprises an accessible computer
data file (or, in biochemical form, a collection of nucleic acid molecules) that contains
the representative nucleotide sequences of genes that are differentially expressed (e.g.,
overexpressed or underexpressed) as between, for example, i) a cancerous cell and a
normal cell; ii) a cancerous cell and a dysplastic cell; iii) a cancerous cell and a cell
affected by a disease or condition other than cancer; iv) a metastatic cancerous cell and
a normal cell and/or non-metastatic cancerous cell; v) a malignant cancerous cell and a
non-malignant cancerous cell (or a normal cell) and/or vi) a dysplastic cell relative to a
normal cell. Other combinations and comparisons of cells affected by various diseases
or stages of disease will be readily apparent to the ordinarily skilled artisan.
Biochemical embodiments of the library include a collection of nucleic acids that have
the sequences of the genes in the library, where the nucleic acids can correspond to the
entire gene in the library or to a fragment thereof, as described in greater detail below.
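As one illustration of an accessible computer data file, a library in electronic form could be kept as a FASTA-style text file. The sketch below writes and reads such a file in Python; the format choice, the function names, and the record names are assumptions made for the example, and the sequences shown in the usage note are placeholders rather than any sequence of the Sequence Listing.

```python
import io


def write_library(records, handle, width=60):
    """Write {name: sequence} records to a text handle in FASTA style."""
    for name, sequence in records.items():
        handle.write(f">{name}\n")
        for start in range(0, len(sequence), width):
            handle.write(sequence[start:start + width] + "\n")


def read_library(handle):
    """Read FASTA-style text back into a {name: sequence} dict."""
    records = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records
```

A round trip through an in-memory buffer, such as `buffer = io.StringIO(); write_library({"example_1": "ACGT" * 20}, buffer); buffer.seek(0); read_library(buffer)`, returns the same records that were written.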



sequence information of a plurality of polynucleotide sequences, where at least one of 
5 the polynucleotides has a sequence of any of SEQ ID NOs: 1-3351. By plurality is 
meant at least 2, usually at least 3 and can include up to all of SEQ ID NOs: 1-3351. 
The length and number of polynucleotides in the library will vary with the nature of the 
library, e.g., if the library is an oligonucleotide array, a cDNA array, a computer 
database of the sequence information, etc. 

10 Where the library is an electronic library, the nucleic acid sequence 

information can be present in a variety of media. "Media" refers to a manufacture, 
other than an isolated nucleic acid molecule, that contains the sequence information of 
the present invention. Such a manufacture provides the genome sequence or a subset 
thereof in a form that can be examined by means not directly applicable to the sequence 

15 as it exists in a nucleic acid. For example, the nucleotide sequence of the present 
invention, e.g., the nucleic acid sequences of any of the polynucleotides of SEQ ID 
NOs:l-3351, can be recorded on computer readable media, e.g., any medium that can be 
read and accessed directly by a computer. Such media include, but are not limited to: 
magnetic storage media, such as a floppy disc, a hard disc storage medium, and a 

20 magnetic tape; optical storage media such as CD-ROM; electrical storage media such as 
RAM and ROM; and hybrids of these categories such as magnetic/optical storage 
media. One of skill in the art can readily appreciate how any of the presently known 
computer readable mediums can be used to create a manufacture comprising a recording 
of the present sequence information. "Recorded" refers to a process for storing 

25 information on computer readable medium, using any such methods as known in the art. 
Any convenient data storage structure can be chosen, based on the means used to access 
the stored information. A variety of data processor programs and formats can be used 
for storage, e.g., word processing text file, database format, etc. In addition to the 
sequence information, electronic versions of the libraries of the invention can be 

30 provided in conjunction or connection with other computer-readable information and/or 
other types of computer-readable files (e.g., searchable files, executable files, etc., 
including, but not limited to, for example, search program software, etc.). 



The polynucleotide libraries of the subject invention generally comprise

By providing the nucleotide sequence in computer readable form, the
information can be accessed for a variety of purposes. Computer software to access
sequence information is publicly available. For example, the BLAST (Altschul et al.,
supra.) and BLAZE (Brutlag et al., Comp. Chem. (1993) 17:203) search algorithms on a
Sybase system can be used to identify open reading frames (ORFs) within the genome
that contain homology to ORFs from other organisms.
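
The ORF identification step mentioned above can be sketched as follows. This is a simplified illustration only: it scans the three forward reading frames for ATG-to-stop stretches and does not perform the homology comparison that BLAST or BLAZE would provide. The function name `find_orfs` and the `min_codons` parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=3):
    """Return (start, end, frame) for each ATG-to-stop ORF on the forward strand.

    Coordinates are 0-based, end-exclusive, and the codon count includes the stop codon.
    """
    seq = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # first in-frame start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (pos + 3 - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs
```

A complete tool would also scan the reverse complement and would typically pass each candidate ORF to a homology search program.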

As used herein, "a computer-based system" refers to the hardware
means, software means, and data storage means used to analyze the nucleotide sequence
information of the present invention. The minimum hardware of the computer-based
systems of the present invention comprises a central processing unit (CPU), input
means, output means, and data storage means. A skilled artisan can readily appreciate
that any one of the currently available computer-based systems is suitable for use in the
present invention. The data storage means can comprise any manufacture comprising a
recording of the present sequence information as described above, or a memory access
means that can access such a manufacture.

"Search means" refers to one or more programs implemented on the 
computer-based system, to compare a target sequence or target structural motif, or
expression levels of a polynucleotide in a sample, with the stored sequence information.
Search means can be used to identify fragments or regions of the genome that match a
particular target sequence or target motif. A variety of algorithms are publicly
known and commercially available, e.g., MacPattern (EMBL), BLASTN and BLASTX
(NCBI). A "target sequence" can be any polynucleotide or amino acid sequence of six
or more contiguous nucleotides or two or more amino acids, preferably from about 10
to 100 amino acids or from about 30 to 300 nt. A variety of comparing means can be
used to accomplish comparison of sequence information from a sample (e.g., to analyze
target sequences, target motifs, or relative expression levels) with the data storage
means. A skilled artisan can readily recognize that any one of the publicly available
homology search programs can be used as the search means for the computer based
systems of the present invention to accomplish comparison of target sequences and
motifs. Computer programs to analyze expression levels in a sample and in controls are
also known in the art.
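
A very simple search means of the kind described above can be sketched as an exact-match scan of stored sequences on both strands. This is an illustrative assumption, not one of the named programs: real homology search tools such as BLASTN tolerate mismatches and gaps. The names `find_target` and `reverse_complement` and the in-memory dictionary standing in for the data storage means are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_target(target, database, min_length=6):
    """Return (record_id, strand, position) for each exact match of target.

    Positions are 0-based on the stored (forward) sequence. A palindromic
    target is reported once per strand.
    """
    target = target.upper()
    if len(target) < min_length:
        raise ValueError("target sequence must have at least %d nt" % min_length)
    hits = []
    for record_id, sequence in database.items():
        sequence = sequence.upper()
        for strand, query in (("+", target), ("-", reverse_complement(target))):
            pos = sequence.find(query)
            while pos != -1:
                hits.append((record_id, strand, pos))
                pos = sequence.find(query, pos + 1)
    return hits
```

The six-nucleotide default mirrors the minimum target sequence length stated in the text.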

A "target structural motif," or "target motif," refers to any rationally 
selected sequence or combination of sequences in which the sequence(s) are chosen
based on a three-dimensional configuration that is formed upon the folding of the target
motif or on consensus sequences of regulatory or active sites. There are a variety of
target motifs known in the art. Protein target motifs include, but are not limited to,
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are
not limited to, hairpin structures, promoter sequences and other expression elements
such as binding sites for transcription factors.

A variety of structural formats for the input and output means can be
used to input and output the information in the computer-based systems of the present
invention. One format for an output means ranks the relative expression levels of
different polynucleotides. Such presentation provides a skilled artisan with a ranking of
relative expression levels to determine a gene expression profile.
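
The ranked output format described above might look like the following sketch. The function name `rank_expression` and the use of a dictionary of arbitrary-unit expression levels are assumptions for illustration.

```python
def rank_expression(levels):
    """Return (name, level, rank) tuples ordered from highest to lowest level."""
    ordered = sorted(levels.items(), key=lambda item: item[1], reverse=True)
    return [(name, level, rank)
            for rank, (name, level) in enumerate(ordered, start=1)]
```

Ties keep their input order because Python's sort is stable; a real output means might instead assign tied entries the same rank.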

As discussed above, the "library" of the invention also encompasses
biochemical libraries of the polynucleotides of SEQ ID NOs:1-3351, e.g., collections of
nucleic acids representing the provided polynucleotides. The biochemical libraries can
take a variety of forms, e.g., a solution of cDNAs, a pattern of probe nucleic acids stably
associated with a surface of a solid support (i.e., an array) and the like. Of particular
interest are nucleic acid arrays in which one or more of SEQ ID NOs:1-3351 is
represented on the array. By array is meant an article of manufacture that has at least a
substrate with at least two distinct nucleic acid targets on one of its surfaces, where the
number of distinct nucleic acids can be considerably higher, typically being at least 10
nt, usually at least 20 nt and often at least 25 nt. A variety of different array formats
have been developed and are known to those of skill in the art. The arrays of the subject
invention find use in a variety of applications, including gene expression analysis, drug
screening, mutation analysis and the like, as disclosed in the above-listed exemplary
patent documents.

In addition to the above nucleic acid libraries, analogous libraries of
polypeptides are also provided, where the polypeptides of the library will
represent at least a portion of the polypeptides encoded by SEQ ID NOs:1-3351.

Use of Polynucleotide Probes in Mapping, and in Tissue Profiling

Polynucleotide probes, generally comprising at least 12 contiguous nt of
a polynucleotide as shown in the Sequence Listing, are used for a variety of purposes,
such as chromosome mapping of the polynucleotide and detection of transcription
levels. Additional disclosure about preferred regions of the disclosed polynucleotide
sequences is found in the Examples. A probe that hybridizes specifically to a
polynucleotide disclosed herein should provide a detection signal at least 5-, 10-, or
20-fold higher than the background hybridization provided with other unrelated sequences.
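
The specificity criterion above reduces to a signal-to-background ratio. The sketch below shows one way to check it; the function names and the choice of 5-fold as the default threshold are assumptions for illustration, and the numeric readings are hypothetical.

```python
def specificity_fold(signal, background):
    """Return how many fold the probe signal exceeds background hybridization."""
    if background <= 0:
        raise ValueError("background must be positive")
    return signal / background


def is_specific(signal, background, min_fold=5.0):
    """True if the signal is at least min_fold times the background."""
    return specificity_fold(signal, background) >= min_fold


# Hypothetical readings in arbitrary units.
print(is_specific(1200.0, 100.0, min_fold=10.0))
print(is_specific(450.0, 100.0))
```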

Detection of Expression Levels. Nucleotide probes are used to detect
expression of a gene corresponding to the provided polynucleotide. In Northern blots,
mRNA is separated electrophoretically and contacted with a probe. A probe is detected
as hybridizing to an mRNA species of a particular size. The amount of hybridization is
quantitated to determine relative amounts of expression, for example under a particular
condition. Probes are used for in situ hybridization to cells to detect expression. Probes
can also be used in vivo for diagnostic detection of hybridizing sequences. Probes are
typically labeled with a radioactive isotope. Other types of detectable labels can be
used such as chromophores, fluors, and enzymes. Other examples of nucleotide
hybridization assays are described in WO92/02526 and U.S. Patent No. 5,124,246.

Alternatively, the Polymerase Chain Reaction (PCR) is another means
for detecting small amounts of target nucleic acids (see, e.g., Mullis et al., Meth.
Enzymol. (1987) 155:335; U.S. Patent No. 4,683,195; and U.S. Patent No. 4,683,202).
Two primer polynucleotides that hybridize with the target nucleic acids are
used to prime the reaction. The primers can be composed of sequence within or 3' and
5' to the polynucleotides of the Sequence Listing. Alternatively, if the primers are 3' and
5' to these polynucleotides, they need not hybridize to them or the complements. After
amplification of the target with a thermostable polymerase, the amplified target nucleic
acids can be detected by methods known in the art, e.g., Southern blot. mRNA or
cDNA can also be detected by traditional blotting techniques (e.g., Southern blot,
Northern blot, etc.) described in Sambrook et al., "Molecular Cloning: A Laboratory
Manual" (New York, Cold Spring Harbor Laboratory, 1989) (e.g., without PCR
amplification). In general, mRNA or cDNA generated from mRNA using a polymerase
enzyme can be purified and separated using gel electrophoresis, and transferred to a
solid support, such as nitrocellulose. The solid support is exposed to a labeled probe,
washed to remove any unhybridized probe, and duplexes containing the labeled probe
are detected.

Mapping. Polynucleotides of the present invention can be used to
identify a chromosome on which the corresponding gene resides. Such mapping can be
useful in identifying the function of the polynucleotide-related gene by its proximity to
other genes with known function. Function can also be assigned to the
polynucleotide-related gene when particular syndromes or diseases map to the same
chromosome. For example, use of polynucleotide probes in identification and
quantification of nucleic acid sequence aberrations is described in U.S. Patent No.
5,783,387. An exemplary mapping method is fluorescence in situ hybridization (FISH),
which facilitates comparative genomic hybridization to allow total genome assessment
of changes in relative copy number of DNA sequences (see, e.g., Valdes et al., Methods
in Molecular Biology (1997) 68:1). Polynucleotides can also be mapped to particular
chromosomes using, for example, radiation hybrids or chromosome-specific hybrid
panels. See Leach et al., Advances in Genetics, (1995) 33:63-99; Walter et al., Nature
Genetics (1994) 7:22; Walter and Goodfellow, Trends in Genetics (1992) 9:352. Panels
for radiation hybrid mapping are available from Research Genetics, Inc., Huntsville,
Alabama, USA. The statistical program RHMAP can be used to construct a map based on
the data from radiation hybridization with a measure of the relative likelihood of one
order versus another. RHMAP is available via the world wide web at
http://www.sph.umich.edu/group/statgen/software. In addition, commercial programs are
available for identifying regions of chromosomes commonly associated with disease,
such as cancer.

Tissue Typing or Profiling. Expression of specific mRNA
corresponding to the provided polynucleotides can vary in different cell types and can
be tissue-specific. This variation of mRNA levels in different cell types can be
exploited with nucleic acid probe assays to determine tissue types. For example, PCR,
branched DNA probe assays, or blotting techniques utilizing nucleic acid probes
substantially identical or complementary to polynucleotides listed in the Sequence
Listing can determine the presence or absence of the corresponding cDNA or mRNA.

Tissue typing can be used to identify the developmental organ or tissue
source of a metastatic lesion by identifying the expression of a particular marker of that
organ or tissue. If a polynucleotide is expressed only in a specific tissue type, and a
metastatic lesion is found to express that polynucleotide, then the developmental source
of the lesion has been identified. Expression of a particular polynucleotide can be
assayed by detection of either the corresponding mRNA or the protein product.

Use of Polymorphisms. A polynucleotide of the invention can be used in
forensics, genetic analysis, mapping, and diagnostic applications where the
corresponding region of a gene is polymorphic in the human population. Any means for
detecting a polymorphism in a gene can be used, including, but not limited to,
electrophoresis of protein polymorphic variants, differential sensitivity to restriction
enzyme cleavage, and hybridization to allele-specific probes.

Antibody Production

Expression products of a polynucleotide of the invention, as well as the
corresponding mRNA, cDNA, or complete gene, can be prepared and used for raising
antibodies for experimental, diagnostic, and therapeutic purposes. For polynucleotides
to which a corresponding gene has not been assigned, this provides an additional
method of identifying the corresponding gene. The polynucleotide or related cDNA is
expressed as described above, and antibodies are prepared. These antibodies are
specific to an epitope on the polypeptide encoded by the polynucleotide, and can
precipitate or bind to the corresponding native protein in a cell or tissue preparation or
in a cell-free extract of an in vitro expression system.

Methods for production of monoclonal and polyclonal antibodies that
specifically bind a selected antigen are well known in the art. The antibodies
specifically bind to epitopes present in the polypeptides encoded by polynucleotides
disclosed in the Sequence Listing. Typically, at least 6, 8, 10, or 12 contiguous amino
acids are required to form an epitope. Epitopes that involve non-contiguous amino
acids may require a longer polypeptide, e.g., at least 15, 25, or 50 amino acids.
Antibodies that specifically bind to human polypeptides encoded by the provided
polynucleotides should provide a detection signal at least 5-, 10-, or 20-fold higher than
a detection signal provided with other proteins when used in Western blots or other
immunochemical assays. Preferably, antibodies that specifically bind polypeptides of the
invention do not bind to other proteins in immunochemical assays at detectable levels
and can immunoprecipitate the specific polypeptide from solution.

The invention also contemplates naturally occurring antibodies specific
for a polypeptide of the invention. For example, serum antibodies to a polypeptide of
the invention in a human population can be purified by methods well known in the art,
e.g., by passing antiserum over a column to which the corresponding selected
polypeptide or fusion protein is bound. The bound antibodies can then be eluted from
the column, for example using a buffer with a high salt concentration.

In addition to the antibodies discussed above, the invention also
contemplates genetically engineered antibodies and antibody derivatives (e.g., single
chain antibodies, antibody fragments (e.g., Fab, etc.)), produced according to methods
well known in the art.

Other embodiments of the present invention include humanized
monoclonal antibodies capable of binding to the polypeptides of the invention. The
phrase "humanized antibody" refers to an antibody derived from a non-human antibody,
typically a mouse monoclonal antibody. Alternatively, a humanized antibody may be
derived from a chimeric antibody that retains or substantially retains the
antigen-binding properties of the parental, non-human, antibody but which exhibits diminished
immunogenicity as compared to the parental antibody when administered to humans.
The phrase "chimeric antibody," as used herein, refers to an antibody containing
sequence derived from two different antibodies (see, e.g., U.S. Patent No. 4,816,567)
which typically originate from different species. Most typically, chimeric antibodies
comprise human and murine antibody fragments, generally human constant and mouse
variable regions.

Because humanized antibodies are far less immunogenic in humans than
the parental mouse monoclonal antibodies, they can be used for the treatment of humans
with far less risk of anaphylaxis. Thus, these antibodies may be preferred in therapeutic
applications that involve in vivo administration to a human such as, e.g., use as radiation
sensitizers for the treatment of neoplastic disease or use in methods to reduce the side
effects of, e.g., cancer therapy.

Humanized antibodies may be achieved by a variety of methods
including, for example: (1) grafting the non-human complementarity determining
regions (CDRs) onto a human framework and constant region (a process referred to in
the art as "humanizing"), or, alternatively, (2) transplanting the entire non-human
variable domains, but "cloaking" them with a human-like surface by replacement of
surface residues (a process referred to in the art as "veneering"). In the present
invention, humanized antibodies will include both "humanized" and "veneered"
antibodies. These methods are disclosed in, e.g., Jones et al., Nature 321:522-525
(1986); Morrison et al., Proc. Natl. Acad. Sci., U.S.A., 81:6851-6855 (1984); Morrison
and Oi, Adv. Immunol., 44:65-92 (1988); Verhoeyen et al., Science 239:1534-1536
(1988); Padlan, Molec. Immun. 28:489-498 (1991); Padlan, Molec. Immunol. 31(3):169-217
(1994); and Kettleborough, C.A. et al., Protein Eng. 4(7):773-83 (1991), each of
which is incorporated herein by reference.

The phrase "complementarity determining region" refers to amino acid
sequences which together define the binding affinity and specificity of the natural Fv
region of a native immunoglobulin binding site. See, e.g., Chothia et al., J. Mol. Biol.
196:901-917 (1987); Kabat et al., U.S. Dept. of Health and Human Services NIH
Publication No. 91-3242 (1991). The phrase "constant region" refers to the portion of
the antibody molecule that confers effector functions. In the present invention, mouse
constant regions are substituted by human constant regions. The constant regions of the
subject humanized antibodies are derived from human immunoglobulins. The heavy
chain constant region can be selected from any of the five isotypes: alpha, delta,
epsilon, gamma or mu.

One method of humanizing antibodies comprises aligning the non-human
heavy and light chain sequences to human heavy and light chain sequences,
selecting and replacing the non-human framework with a human framework based on
such alignment, molecular modeling to predict the conformation of the humanized
sequence and comparing to the conformation of the parent antibody. This process is
followed by repeated back mutation of residues in the CDR region which disturb the
structure of the CDRs until the predicted conformation of the humanized sequence
model closely approximates the conformation of the non-human CDRs of the parent
non-human antibody. Such humanized antibodies may be further derivatized to
facilitate uptake and clearance, e.g., via Ashwell receptors. See, e.g., U.S. Patent Nos.
5,530,101 and 5,585,089, which patents are incorporated herein by reference.

Humanized antibodies can also be produced using transgenic animals
that are engineered to contain human immunoglobulin loci. For example, WO
98/24893 discloses transgenic animals having a human Ig locus wherein the animals do
not produce functional endogenous immunoglobulins due to the inactivation of
endogenous heavy and light chain loci. WO 91/10741 also discloses transgenic
non-primate mammalian hosts capable of mounting an immune response to an immunogen,
wherein the antibodies have primate constant and/or variable regions, and wherein the
endogenous immunoglobulin-encoding loci are substituted or inactivated. WO
96/30498 discloses the use of the Cre/Lox system to modify the immunoglobulin locus
in a mammal, such as to replace all or a portion of the constant or variable region to
form a modified antibody molecule. WO 94/02602 discloses non-human mammalian
hosts having inactivated endogenous Ig loci and functional human Ig loci. U.S. Patent
No. 5,939,598 discloses methods of making transgenic mice in which the mice lack
endogenous heavy chains, and express an exogenous immunoglobulin locus comprising
one or more xenogeneic constant regions.

Using a transgenic animal described above, an immune response can be
produced to a selected antigenic molecule, and antibody-producing cells can be
removed from the animal and used to produce hybridomas that secrete human
monoclonal antibodies. Immunization protocols, adjuvants, and the like are known in
the art, and are used in immunization of, for example, a transgenic mouse as described
in WO 96/33735. This publication discloses monoclonal antibodies against a variety of
antigenic molecules including IL-6, IL-8, TNFα, human CD4, L-selectin, gp39, and
tetanus toxin. The monoclonal antibodies can be tested for the ability to inhibit or
neutralize the biological activity or physiological effect of the corresponding protein.
WO 96/33735 discloses that monoclonal antibodies against IL-8, derived from immune
cells of transgenic mice immunized with IL-8, blocked IL-8-induced functions of
neutrophils. Human monoclonal antibodies with specificity for the antigen used to
immunize transgenic animals are also disclosed in WO 96/34096.

Polynucleotides or Arrays for Diagnostics 

Polynucleotide arrays are created by spotting polynucleotide probes onto
a substrate (e.g., glass, nitrocellulose, etc.) in a two-dimensional matrix or array having
bound probes. The probes can be bound to the substrate by either covalent bonds or by
non-specific interactions, such as hydrophobic interactions. Samples of polynucleotides
can be detectably labeled (e.g., using radioactive or fluorescent labels) and then
hybridized to the probes. Double stranded polynucleotides, comprising the labeled
sample polynucleotides bound to probe polynucleotides, can be detected once the
unbound portion of the sample is washed away. Techniques for constructing arrays and
methods of using these arrays are described in EP 799 897; WO 97/29212; WO
97/27317; EP 785 280; WO 97/02357; U.S. Patent No. 5,593,839; U.S. Patent No.
5,578,832; EP 728 520; U.S. Patent No. 5,599,695; EP 721 016; U.S. Patent No.
5,556,752; WO 95/22058; and U.S. Patent No. 5,631,734. Arrays can be used to, for
example, examine differential expression of genes and can be used to determine gene
function. For example, arrays can be used to detect differential expression of a
polynucleotide between a test cell and control cell (e.g., cancer cells and normal cells).
For example, high expression of a particular message in a cancer cell, which is not
observed in a corresponding normal cell, can indicate a cancer specific gene product.
Exemplary uses of arrays are further described in, for example, Pappalardo et al., Sem.
Radiation Oncol. (1998) 8:217; and Ramsay, Nature Biotechnol. (1998) 16:40.

Differential Expression in Diagnosis 

The polynucleotides of the invention can also be used to detect
differences in expression levels between two cells, e.g., as a method to identify
abnormal or diseased tissue in a human. For polynucleotides corresponding to profiles
of protein families, the choice of tissue can be selected according to the putative
biological function. In general, the expression of a gene corresponding to a specific
polynucleotide is compared between a first tissue that is suspected of being diseased
and a second, normal tissue of the human. The tissue suspected of being abnormal or
diseased can be derived from a different tissue type of the human, but preferably it is
derived from the same tissue type; for example, an intestinal polyp or other abnormal
growth should be compared with normal intestinal tissue. The normal tissue can be the
same tissue as that of the test sample, or any normal tissue of the patient, especially
those that express the polynucleotide-related gene of interest (e.g., brain, thymus, testis,
heart, prostate, placenta, spleen, small intestine, skeletal muscle, pancreas, and the
mucosal lining of the colon). A difference between the polynucleotide-related gene,
mRNA, or protein in the two tissues which are compared, for example in molecular
weight, amino acid or nucleotide sequence, or relative abundance, indicates a change in
the gene, or a gene which regulates it, in the tissue of the human that was suspected of
being diseased. Examples of detection of differential expression and its use in diagnosis
of cancer are described in U.S. Patent Nos. 5,688,641 and 5,677,125.

A genetic predisposition to disease in a human can also be detected by
comparing expression levels of an mRNA or protein corresponding to a polynucleotide
of the invention in a fetal tissue with levels associated with normal fetal tissue. Fetal
tissues that are used for this purpose include, but are not limited to, amniotic fluid,
chorionic villi, blood, and the blastomere of an in vitro-fertilized embryo. The
comparable normal polynucleotide-related gene is obtained from any tissue. The mRNA
or protein is obtained from a normal tissue of a human in which the
polynucleotide-related gene is expressed. Differences such as alterations in the nucleotide sequence or
size of the same product of the fetal polynucleotide-related gene or mRNA, or
alterations in the molecular weight, amino acid sequence, or relative abundance of fetal
protein, can indicate a germline mutation in the polynucleotide-related gene of the fetus,
which indicates a genetic predisposition to disease. In general, diagnostic, prognostic,
and other methods of the invention based on differential expression involve detection of
a level or amount of a gene product, particularly a differentially expressed gene product,
in a test sample obtained from a patient suspected of having or being susceptible to a
disease (e.g., breast cancer, lung cancer, colon cancer and/or metastatic forms thereof),
and comparing the detected levels to those levels found in normal cells (e.g., cells
substantially unaffected by cancer) and/or other control cells (e.g., to differentiate a
cancerous cell from a cell affected by dysplasia). Furthermore, the severity of the
disease can be assessed by comparing the detected levels of a differentially expressed
gene product with those levels detected in samples representing the levels of
differentially expressed gene product associated with varying degrees of severity of
disease. It should be noted that use of the term "diagnostic" herein is not necessarily
meant to exclude "prognostic" or "prognosis," but rather is used as a matter of convenience.

The term "differentially expressed gene" is generally intended to
encompass a polynucleotide that can, for example, include an open reading frame
encoding a gene product (e.g., a polypeptide), and/or introns of such genes and adjacent
5' and 3' non-coding nucleotide sequences involved in the regulation of expression, up
to about 20 kb beyond the coding region, but possibly further in either direction. The
gene can be introduced into an appropriate vector for extrachromosomal maintenance or
for integration into a host genome. In general, a difference in expression level
associated with a decrease in expression level of at least about 25%, usually at least
about 50% to 75%, more usually at least about 90% or more is indicative of a
differentially expressed gene of interest, i.e., a gene that is underexpressed or
down-regulated in the test sample relative to a control sample. Furthermore, a difference in
expression level associated with an increase in expression of at least about 25%, usually
at least about 50% to 75%, more usually at least about 90% and can be at least about
1½-fold, usually at least about 2-fold to about 10-fold, and can be about 100-fold to
about 1,000-fold increase relative to a control sample is indicative of a differentially
expressed gene of interest, i.e., an overexpressed or up-regulated gene.
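
The thresholds above can be applied as a simple relative-change test. The sketch below uses the lowest stated cut-off (a change of at least about 25% relative to the control) as its default; the function name `classify_expression`, the labels it returns, and the example values are assumptions for illustration.

```python
def classify_expression(test_level, control_level, min_change=0.25):
    """Classify a gene by its relative change in the test sample versus the control.

    min_change is a fraction of the control level (0.25 corresponds to 25%).
    """
    if control_level <= 0:
        raise ValueError("control level must be positive")
    change = (test_level - control_level) / control_level
    if change >= min_change:
        return "overexpressed"
    if change <= -min_change:
        return "underexpressed"
    return "unchanged"


# Hypothetical expression levels in arbitrary units.
print(classify_expression(200.0, 100.0))
print(classify_expression(40.0, 100.0))
print(classify_expression(110.0, 100.0))
```

A stricter screen would raise `min_change` toward the 50% to 90% figures, or compare fold change (`test_level / control_level`) against the 2-fold to 10-fold range.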

"Differentially expressed polynucleotide" as used herein means a nucleic
acid molecule (RNA or DNA) comprising a sequence that represents a differentially
expressed gene, e.g., the differentially expressed polynucleotide comprises a sequence
(e.g., an open reading frame encoding a gene product) that uniquely identifies a
differentially expressed gene so that detection of the differentially expressed
polynucleotide in a sample is correlated with the presence of a differentially expressed
gene in a sample. "Differentially expressed polynucleotides" is also meant to
encompass fragments of the disclosed polynucleotides, e.g., fragments retaining
biological activity, as well as nucleic acids homologous, substantially similar, or
substantially identical (e.g., having about 90% sequence identity) to the disclosed
polynucleotides.
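
Percent sequence identity, as used in the "about 90%" criterion above, can be computed for two sequences that have already been aligned. The sketch below assumes an ungapped alignment of equal-length sequences, which is a simplification; the text does not specify an alignment method, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Return percent identity of two pre-aligned, equal-length sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)
```

For sequences of different length or with insertions and deletions, an alignment program would be run first and identity computed over the aligned columns.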



"Diagnosis" as used herein generally includes determination of a
subject's susceptibility to a disease or disorder, determination as to whether a subject is
presently affected by a disease or disorder, as well as to the prognosis of a subject
affected by a disease or disorder (e.g., identification of pre-metastatic or metastatic
cancerous states, stages of cancer, or responsiveness of cancer to therapy). The present
invention particularly encompasses diagnosis of subjects in the context of breast cancer
(e.g., carcinoma in situ (e.g., ductal carcinoma in situ), estrogen receptor (ER)-positive
breast cancer, ER-negative breast cancer, or other forms and/or stages of breast cancer),
lung cancer (e.g., small cell carcinoma, non-small cell carcinoma, mesothelioma, and
other forms and/or stages of lung cancer), and colon cancer (e.g., adenomatous polyp,
colorectal carcinoma, and other forms and/or stages of colon cancer).



"Sample" or "biological sample" as used throughout herein are generally
meant to refer to samples of biological fluids or tissues, particularly samples obtained
from tissues, especially from cells of the type associated with the disease for which the
diagnostic application is designed (e.g., ductal adenocarcinoma), and the like.
"Samples" is also meant to encompass derivatives and fractions of such samples (e.g.,
cell lysates). Where the sample is solid tissue, the cells of the tissue can be dissociated
or tissue sections can be analyzed.

Methods of the subject invention useful in diagnosis or prognosis
typically involve comparison of the abundance of a selected differentially expressed
gene product in a sample of interest with that of a control to determine any relative
differences in the expression of the gene product, where the difference can be measured
qualitatively and/or quantitatively. Quantitation can be accomplished, for example, by
comparing the level of expression product detected in the sample with the amounts of
product present in a standard curve. A comparison can be made visually; by using a
technique such as densitometry, with or without computerized assistance; by preparing
a representative library of cDNA clones of mRNA isolated from a test sample,
sequencing the clones in the library to determine the number of cDNA clones
corresponding to the same gene product, and analyzing the number of clones
corresponding to that same gene product relative to the number of clones of the same
gene product in a control sample; or by using an array to detect relative levels of
hybridization to a selected sequence or set of sequences, and comparing the
hybridization pattern to that of a control. The differences in expression are then
correlated with the presence or absence of an abnormal expression pattern. A variety of
different methods for determining the nucleic acid abundance in a sample are known to
those of skill in the art (see, e.g., WO 97/27317). In general, diagnostic assays of the
invention involve detection of a gene product of a polynucleotide sequence (e.g.,
mRNA or polypeptide) that corresponds to a sequence of SEQ ID NOs:1-3351. The
patient from whom the sample is obtained can be apparently healthy, susceptible to
disease (e.g., as determined by family history or exposure to certain environmental
factors), or can already be identified as having a condition in which altered expression
of a gene product of the invention is implicated.

Diagnosis can be determined based on detected gene product expression
levels of a gene product encoded by at least one, preferably at least two or more, at least
3 or more, or at least 4 or more of the polynucleotides having a sequence set forth in
SEQ ID NOs:1-3351, and can involve detection of expression of genes corresponding to
all of SEQ ID NOs:1-3351 and/or additional sequences that can serve as additional
diagnostic markers and/or reference sequences. Where the diagnostic method is
designed to detect the presence or susceptibility of a patient to cancer, the assay
preferably involves detection of a gene product encoded by a gene corresponding to a
polynucleotide that is differentially expressed in cancer. Examples of such differentially
expressed polynucleotides are described in the Examples below. Given the provided
polynucleotides and information regarding their relative expression levels provided
herein, assays using such polynucleotides and detection of their expression levels in
diagnosis and prognosis will be readily apparent to the ordinarily skilled artisan.

Any of a variety of detectable labels can be used in connection with the
various embodiments of the diagnostic methods of the invention. Suitable detectable
labels include fluorochromes (e.g., fluorescein isothiocyanate (FITC), rhodamine,
Texas Red, phycoerythrin, allophycocyanin, 6-carboxyfluorescein (6-FAM),
2',7'-dimethoxy-4',5'-dichloro-6-carboxyfluorescein, 6-carboxy-X-rhodamine (ROX),
6-carboxy-2',4',7',4,7-hexachlorofluorescein (HEX), 5-carboxyfluorescein (5-FAM) or
N,N,N',N'-tetramethyl-6-carboxyrhodamine (TAMRA)), radioactive labels (e.g., 32P,
35S, 3H, etc.), and the like. The detectable label can involve a two-stage system (e.g.,
biotin-avidin, hapten-anti-hapten antibody, etc.).

Reagents specific for the polynucleotides and polypeptides of the 
invention, such as antibodies and nucleotide probes, can be supplied in a kit for 
detecting the presence of an expression product in a biological sample. The kit can also 
contain buffers or labeling components, as well as instructions for using the reagents to 
detect and quantify expression products in the biological sample. Exemplary
embodiments of the diagnostic methods of the invention are described below in more 
detail. 

Polypeptide detection in diagnosis. In one embodiment, the test sample
is assayed for the level of a differentially expressed polypeptide. Diagnosis can be
accomplished using any of a number of methods to determine the absence or presence
or altered amounts of the differentially expressed polypeptide in the test sample. For
example, detection can utilize staining of cells or histological sections with labeled
antibodies, performed in accordance with conventional methods. Cells can be
permeabilized to stain cytoplasmic molecules. In general, antibodies that specifically
bind a differentially expressed polypeptide of the invention are added to a sample, and
incubated for a period of time sufficient to allow binding to the epitope, usually at least
about 10 minutes. The antibody can be detectably labeled for direct detection (e.g.,
using radioisotopes, enzymes, fluorescers, chemiluminescers, and the like), or can be
used in conjunction with a second stage antibody or reagent to detect binding (e.g.,
biotin with horseradish peroxidase-conjugated avidin, a secondary antibody conjugated
to a fluorescent compound, e.g., fluorescein, rhodamine, Texas red, etc.). The absence
or presence of antibody binding can be determined by various methods, including flow
cytometry of dissociated cells, microscopy, radiography, scintillation counting, etc.
Any suitable alternative method of qualitative or quantitative detection of levels or
amounts of differentially expressed polypeptide can be used, for example ELISA,
western blot, immunoprecipitation, radioimmunoassay, etc.



mRNA detection. The diagnostic methods of the invention can also or
alternatively involve detection of mRNA encoded by a gene corresponding to a
differentially expressed polynucleotide of the invention. Any suitable qualitative or
quantitative methods known in the art for detecting specific mRNAs can be used.
mRNA can be detected by, for example, in situ hybridization in tissue sections, by
reverse transcriptase-PCR, or in Northern blots containing poly A+ mRNA. One of
skill in the art can readily use these methods to determine differences in the size or
amount of mRNA transcripts between two samples. mRNA expression levels in a
sample can also be determined by generation of a library of expressed sequence tags
(ESTs) from the sample, where the EST library is representative of sequences present in
the sample (Adams, et al., (1991) Science 252:1651). Enumeration of the relative
representation of ESTs within the library can be used to approximate the relative
representation of the gene transcript within the starting sample. The results of EST
analysis of a test sample can then be compared to EST analysis of a reference sample to
determine the relative expression levels of a selected polynucleotide, particularly a
polynucleotide corresponding to one or more of the differentially expressed genes
described herein. Alternatively, gene expression in a test sample can be analyzed
using serial analysis of gene expression (SAGE) methodology (e.g., Velculescu et al.,
Science (1995) 270:484) or differential display (DD) methodology (see, e.g., U.S.
Patent NOs. 5,776,683 and 5,807,680).

Alternatively, gene expression can be analyzed using hybridization
analysis. Oligonucleotides or cDNA can be used to selectively identify or capture DNA
or RNA of specific sequence composition, and the amount of RNA or cDNA hybridized
to a known capture sequence determined qualitatively or quantitatively, to provide
information about the relative representation of a particular message within the pool of
cellular messages in a sample. Hybridization analysis can be designed to allow for
concurrent screening of the relative expression of hundreds to thousands of genes by
using, for example, array-based technologies having high density formats, including
filters, microscope slides, or microchips, or solution-based technologies that use
spectroscopic analysis (e.g., mass spectrometry). One exemplary use of arrays in the
diagnostic methods of the invention is described below in more detail.
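
The EST enumeration approach described above for mRNA detection can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function names, the gene labels, and the pseudocount are hypothetical and do not come from the specification, and each EST is assumed to have already been assigned to a gene.

```python
from collections import Counter


def est_frequencies(est_gene_ids):
    """Return each gene's share of all ESTs sequenced from one library."""
    counts = Counter(est_gene_ids)
    total = sum(counts.values())
    return {gene: n / total for gene, n in counts.items()}


def relative_expression(test_ests, reference_ests, gene, pseudocount=0.5):
    """Ratio of a gene's EST frequency in the test library to the reference library.

    The pseudocount is an assumption of this sketch; it keeps the ratio finite
    when a gene has no ESTs in one of the libraries.
    """
    test_freq = (Counter(test_ests)[gene] + pseudocount) / len(test_ests)
    ref_freq = (Counter(reference_ests)[gene] + pseudocount) / len(reference_ests)
    return test_freq / ref_freq


# Hypothetical libraries: each entry is the gene assigned to one sequenced EST.
test_library = ["geneA"] * 3 + ["geneB"]
reference_library = ["geneA"] + ["geneB"] * 3

print(est_frequencies(test_library))
print(relative_expression(test_library, reference_library, "geneA", pseudocount=0))
```

A ratio above 1 suggests higher representation of the transcript in the test sample than in the reference sample; with small libraries such ratios are only rough approximations.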

Use of a single gene in diagnostic applications. The diagnostic methods
of the invention can focus on the expression of a single differentially expressed gene.
For example, the diagnostic method can involve detecting a differentially expressed
gene, or a polymorphism of such a gene (e.g., a polymorphism in a coding region or
control region), that is associated with disease. Disease-associated polymorphisms can
include deletion or truncation of the gene, mutations that alter expression level and/or
affect activity of the encoded protein, etc.

A number of methods are available for analyzing nucleic acids for the
presence of a specific sequence, e.g., a disease associated polymorphism. Where large
amounts of DNA are available, genomic DNA is used directly. Alternatively, the
region of interest is cloned into a suitable vector and grown in sufficient quantity for
analysis. Cells that express a differentially expressed gene can be used as a source of
mRNA, which can be assayed directly or reverse transcribed into cDNA for analysis.
The nucleic acid can be amplified by conventional techniques, such as the polymerase
chain reaction (PCR), to provide sufficient amounts for analysis, and a detectable label
can be included in the amplification reaction (e.g., using a detectably labeled primer or
detectably labeled oligonucleotides) to facilitate detection. Alternatively, various
methods are also known in the art that utilize oligonucleotide ligation as a means of
detecting polymorphisms, see e.g., Riley et al., Nucl. Acids Res. (1990) 18:2887; and
Delahunty et al., Am. J. Hum. Genet. (1996) 58:1239.

The amplified or cloned sample nucleic acid can be analyzed by one of a
number of methods known in the art. The nucleic acid can be sequenced by dideoxy or
other methods, and the sequence of bases compared to a selected sequence, e.g., to a
wild-type sequence. Hybridization with the polymorphic or variant sequence can also
be used to determine its presence in a sample (e.g., by Southern blot, dot blot, etc.). The
hybridization pattern of a polymorphic or variant sequence and a control sequence to an
array of oligonucleotide probes immobilized on a solid support, as described in U.S.
Patent No. 5,445,934, or in WO 95/35505, can also be used as a means of identifying



WO 01/02568 



PCT/US00/18374



polymorphic or variant sequences associated with disease. Single strand 
conformational polymorphism (SSCP) analysis, denaturing gradient gel electrophoresis 
(DGGE), and heteroduplex analysis in gel matrices are used to detect conformational 
changes created by DNA sequence variation as alterations in electrophoretic mobility. 
Alternatively, where a polymorphism creates or destroys a recognition site for a
restriction endonuclease, the sample is digested with that endonuclease, and the 
products size fractionated to determine whether the fragment was digested. 
Fractionation is performed by gel or capillary electrophoresis, particularly acrylamide or 
agarose gels. 
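
The restriction-digest test described above can be sketched computationally. The sketch is illustrative only: the sequences and function names are hypothetical, the molecule is assumed to be linear, and only one strand is searched. The example uses the EcoRI recognition site GAATTC, which is cut after the first G.

```python
def digest_fragments(sequence, site, cut_offset):
    """Return fragment lengths after cutting a linear sequence at every site.

    cut_offset is the position within the recognition site where the cut falls.
    """
    sequence = sequence.upper()
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cut_positions + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:]) if end > begin]


def site_change(reference, sample, site, cut_offset):
    """Compare fragment counts to decide whether a site was created or destroyed."""
    ref_count = len(digest_fragments(reference, site, cut_offset))
    sample_count = len(digest_fragments(sample, site, cut_offset))
    if sample_count > ref_count:
        return "site created"
    if sample_count < ref_count:
        return "site destroyed"
    return "no change"


# Hypothetical sequences: the variant carries a single base change inside the site.
reference_seq = "AAAAGAATTCAAAAAA"
variant_seq = "AAAAGAGTTCAAAAAA"

print(digest_fragments(reference_seq, "GAATTC", 1))
print(digest_fragments(variant_seq, "GAATTC", 1))
print(site_change(reference_seq, variant_seq, "GAATTC", 1))
```

The fragment lengths correspond to the bands that size fractionation would resolve: two fragments for the digested reference, one undigested fragment for the variant.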

Screening for mutations in a gene can be based on the functional or
antigenic characteristics of the protein. Protein truncation assays are useful in detecting
deletions that can affect the biological activity of the protein. Various immunoassays
designed to detect polymorphisms in proteins can be used in screening. Where many
diverse genetic mutations lead to a particular disease phenotype, functional protein
assays have proven to be effective screening tools. The activity of the encoded protein
can be determined by comparison with the wild-type protein.

Pattern matching in diagnosis using arrays. In another embodiment, the
diagnostic and/or prognostic methods of the invention involve detection of expression
of a selected set of genes in a test sample to produce a test expression pattern (TEP).
The TEP is compared to a reference expression pattern (REP), which is generated by
detection of expression of the selected set of genes in a reference sample (e.g., a
positive or negative control sample). The selected set of genes includes at least one of
the genes of the invention, which genes correspond to the polynucleotide sequences of
SEQ ID NOs: 1-3351. Of particular interest is a selected set of genes that includes genes
differentially expressed in the disease for which the test sample is to be screened.

"Reference sequences" or "reference polynucleotides" as used herein in
the context of differential gene expression analysis and diagnosis/prognosis refer to a
selected set of polynucleotides, which selected set includes at least one or more of the
differentially expressed polynucleotides described herein. A plurality of reference
sequences, preferably comprising positive and negative control sequences, can be
included as reference sequences. Additional suitable reference sequences are found in
Genbank, Unigene, and other nucleotide sequence databases (including, e.g., expressed
sequence tag (EST), partial, and full-length sequences).

"Reference array" means an array having reference sequences for use in
hybridization with a sample, where the reference sequences include all, at least one of,
or any subset of the differentially expressed polynucleotides described herein. Usually
such an array will include at least 3 different reference sequences, and can include any
one or all of the provided differentially expressed sequences. Arrays of interest can
further comprise sequences, including polymorphisms, of other genetic sequences,
particularly other sequences of interest for screening for a disease or disorder (e.g.,
cancer, dysplasia, or other related or unrelated diseases, disorders, or conditions). The
oligonucleotide sequence on the array will usually be at least about 12 nt in length, and
can be of about the length of the provided sequences, or can extend into the flanking
regions to generate fragments of 100 nt to 200 nt in length or more. Reference arrays
can be produced according to any suitable methods known in the art. For example,
methods of producing large arrays of oligonucleotides are described in U.S. Patent NOs.
5,134,854 and 5,445,934 using light-directed synthesis techniques. Using a computer
controlled system, a heterogeneous array of monomers is converted, through
simultaneous coupling at a number of reaction sites, into a heterogeneous array of
polymers. Alternatively, microarrays are generated by deposition of pre-synthesized
oligonucleotides onto a solid substrate, for example as described in PCT published
application no. WO 95/35505.

A "reference expression pattern" or "REP" as used herein refers to the 
relative levels of expression of a selected set of genes, particularly of differentially
expressed genes, that is associated with a selected cell type, e.g., a normal cell, a
cancerous cell, a cell exposed to an environmental stimulus, and the like. A "test 
expression pattern" or "TEP" refers to relative levels of expression of a selected set of 
genes, particularly of differentially expressed genes, in a test sample (e.g., a cell of 
unknown or suspected disease state, from which mRNA is isolated). 

REPs can be generated in a variety of ways according to methods well
known in the art. For example, REPs can be generated by hybridizing a control sample
to an array having a selected set of polynucleotides (particularly a selected set of
differentially expressed polynucleotides), acquiring the hybridization data from the
array, and storing the data in a format that allows for ready comparison of the REP with
a TEP. Alternatively, all expressed sequences in a control sample can be isolated and
sequenced, e.g., by isolating mRNA from a control sample, converting the mRNA into
cDNA, and sequencing the cDNA. The resulting sequence information roughly or
precisely reflects the identity and relative number of expressed sequences in the sample.
The sequence information can then be stored in a format (e.g., a computer-readable
format) that allows for ready comparison of the REP with a TEP. The REP can be
normalized prior to or after data storage, and/or can be processed to selectively remove
sequences of expressed genes that are of less interest or that might complicate analysis
(e.g., some or all of the sequences associated with housekeeping genes can be
eliminated from REP data).
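
The normalization and filtering of REP data described above can be sketched as follows. The specification does not prescribe a normalization scheme; scaling to the total remaining signal is an assumption made here for illustration, and the function name, gene labels, and signal values are hypothetical (ACTB stands in for a housekeeping gene).

```python
def normalize_pattern(signals, exclude=()):
    """Drop excluded genes, then scale the remaining signals so they sum to 1.

    signals maps a gene label to its raw signal (e.g., hybridization intensity
    or clone count). Total-signal scaling is an assumption of this sketch.
    """
    kept = {gene: value for gene, value in signals.items() if gene not in exclude}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("no signal left after filtering")
    return {gene: value / total for gene, value in kept.items()}


# Hypothetical raw control-sample data; ACTB is treated as a housekeeping gene.
raw_rep = {"geneA": 30.0, "geneB": 10.0, "ACTB": 960.0}
rep = normalize_pattern(raw_rep, exclude={"ACTB"})
print(rep)
```

Storing the pattern in this normalized form lets a TEP processed the same way be compared with it directly.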

TEPs can be generated in a manner similar to REPs, e.g., by hybridizing
a test sample to an array having a selected set of polynucleotides, particularly a selected
set of differentially expressed polynucleotides, acquiring the hybridization data from the
array, and storing the data in a format that allows for ready comparison of the TEP with
a REP. The REP and TEP to be used in a comparison can be generated simultaneously,
or the TEP can be compared to previously generated and stored REPs.

In one embodiment of the invention, comparison of a TEP with a REP
involves hybridizing a test sample with a reference array, where the reference array has
one or more reference sequences for use in hybridization with a sample. The reference
sequences include all, at least one of, or any subset of the differentially expressed
polynucleotides described herein. Hybridization data for the test sample is acquired, the
data normalized, and the produced TEP compared with a REP generated using an array
having the same or similar selected set of differentially expressed polynucleotides.
Probes that correspond to sequences differentially expressed between the two samples
will show decreased or increased hybridization efficiency for one of the samples
relative to the other.

Methods for collection of data from hybridization of samples with
reference arrays are well known in the art. For example, the polynucleotides of the
reference and test samples can be generated using a detectable fluorescent label, and
hybridization of the polynucleotides in the samples detected by scanning the
microarrays for the presence of the detectable label using, for example, a microscope
and light source for directing light at a substrate. A photon counter detects fluorescence
from the substrate, while an x-y translation stage varies the location of the substrate. A
confocal detection device that can be used in the subject methods is described in U.S.
Patent No. 5,631,734. A scanning laser microscope is described in Shalon et al.,
Genome Res. (1996) 6:639. A scan, using the appropriate excitation line, is performed
for each fluorophore used. The digital images generated from the scan are then
combined for subsequent analysis. For any particular array element, the
fluorescent signal from one sample (e.g., a test sample) is compared to the fluorescent
signal from another sample (e.g., a reference sample), and the relative signal intensity
determined.
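
The per-element comparison of signals described above is commonly expressed as a log ratio. The following is a minimal sketch under stated assumptions: the function name, element labels, intensity values, and the signal floor are hypothetical and are not taken from the specification.

```python
import math


def log2_ratios(test_signal, reference_signal, floor=1.0):
    """Return log2(test / reference) for every array element present in both scans.

    Signals below `floor` are raised to it so that the ratio stays defined;
    the floor value is an assumption of this sketch.
    """
    ratios = {}
    for element in test_signal.keys() & reference_signal.keys():
        test_value = max(test_signal[element], floor)
        reference_value = max(reference_signal[element], floor)
        ratios[element] = math.log2(test_value / reference_value)
    return ratios


# Hypothetical background-corrected intensities from the two fluorophore scans.
test_scan = {"element1": 800.0, "element2": 100.0}
reference_scan = {"element1": 200.0, "element2": 400.0}
print(log2_ratios(test_scan, reference_scan))
```

Positive values indicate a stronger signal in the test sample, negative values a stronger signal in the reference sample, and values near zero indicate similar signal in both.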

Methods for analyzing the data collected from hybridization to arrays are
well known in the art. For example, where detection of hybridization involves a
fluorescent label, data analysis can include the steps of determining fluorescent intensity
as a function of substrate position from the data collected, removing outliers, i.e., data
deviating from a predetermined statistical distribution, and calculating the relative
binding affinity of the targets from the remaining data. The resulting data can be
displayed as an image with the intensity in each region varying according to the binding
affinity between targets and probes.
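
The outlier-removal step mentioned above can be illustrated with one simple rule. The specification does not say which statistical distribution or cutoff is used; the median-based rule, the threshold of three deviations, and the pixel values below are assumptions chosen only for illustration.

```python
import statistics


def remove_outliers(intensities, max_deviations=3.0):
    """Keep values within max_deviations median absolute deviations of the median."""
    center = statistics.median(intensities)
    spread = statistics.median([abs(value - center) for value in intensities])
    if spread == 0:
        # All values are (nearly) identical; nothing can be called an outlier.
        return list(intensities)
    return [v for v in intensities if abs(v - center) <= max_deviations * spread]


# Hypothetical pixel intensities for one array position; 95 is a bright artifact.
spot_pixels = [10, 11, 9, 10, 12, 95]
cleaned = remove_outliers(spot_pixels)
print(cleaned, statistics.mean(cleaned))
```

The mean of the remaining values can then serve as the intensity estimate for that position in subsequent ratio calculations.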

In general, the test sample is classified as having a gene expression
profile corresponding to that associated with a disease or non-disease state by
comparing the TEP generated from the test sample to one or more REPs generated from
reference samples (e.g., from samples associated with cancer or specific stages of
cancer, dysplasia, samples affected by a disease other than cancer, normal samples,
etc.). The criteria for a match or a substantial match between a TEP and a REP include
expression of the same or substantially the same set of reference genes, as well as
expression of these reference genes at substantially the same levels (e.g., no significant
difference between the samples for a signal associated with a selected reference
sequence after normalization of the samples, or at least no greater than about 25% to
about 40% difference in signal strength for a given reference sequence). In general, a
pattern match between a TEP and a REP includes a match in expression, preferably a
match in qualitative or quantitative expression level, of at least one of, all or any subset
of the differentially expressed genes of the invention.
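
The match criteria above can be sketched as a simple comparison. This is one possible reading, not the method of the invention: the function name and data are hypothetical, the "same set of reference genes" criterion is applied strictly, and the percent difference is assumed to be measured relative to the larger of the two signals, which the specification does not specify.

```python
def patterns_match(tep, rep, max_relative_difference=0.25):
    """Return True if the TEP and REP cover the same genes at similar levels.

    tep and rep map reference-sequence labels to normalized signal strengths.
    """
    if tep.keys() != rep.keys():
        return False
    for gene, rep_level in rep.items():
        tep_level = tep[gene]
        larger = max(tep_level, rep_level)
        if larger == 0:
            continue  # not expressed in either sample
        if abs(tep_level - rep_level) / larger > max_relative_difference:
            return False
    return True


# Hypothetical normalized signals for two reference sequences.
rep_pattern = {"seq1": 100.0, "seq2": 50.0}
close_tep = {"seq1": 90.0, "seq2": 55.0}
borderline_tep = {"seq1": 70.0, "seq2": 50.0}

print(patterns_match(close_tep, rep_pattern))
print(patterns_match(borderline_tep, rep_pattern))
print(patterns_match(borderline_tep, rep_pattern, max_relative_difference=0.40))
```

The third call shows how the outcome for a borderline sample depends on where the tolerance is set within the stated range of about 25% to about 40%.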

Pattern matching can be performed manually, or can be performed using 
a computer program. Methods for preparation of substrate matrices (e.g., arrays),
design of oligonucleotides for use with such matrices, labeling of probes, hybridization
conditions, scanning of hybridized matrices, and analysis of patterns generated, 
including comparison analysis, are described in, for example, U.S. Patent No. 
5,800,992. 

Diagnosis, Prognosis and Management of Cancer 
The polynucleotides of the invention and their gene products are of
particular interest as genetic or biochemical markers (e.g., in blood or tissues) that will
detect the earliest changes along the carcinogenesis pathway and/or monitor the
efficacy of various therapies and preventive interventions. For example, the level of
expression of certain polynucleotides can be indicative of a poorer prognosis, and
therefore warrant more aggressive chemo- or radio-therapy for a patient or vice versa.
The correlation of novel surrogate tumor specific features with response to treatment
and outcome in patients can define prognostic indicators that allow the design of
tailored therapy based on the molecular profile of the tumor. These therapies include
antibody targeting and gene therapy. Determining expression of certain polynucleotides
and comparison of a patient's profile with known expression in normal tissue and
variants of the disease allows a determination of the best possible treatment for a
patient, both in terms of specificity of treatment and in terms of comfort level of the
patient. Surrogate tumor markers, such as polynucleotide expression, can also be used
to better classify, and thus diagnose and treat, different forms and disease states of
cancer. Two classifications widely used in oncology that can benefit from identification
of the expression levels of the polynucleotides of the invention are staging of the
cancerous disorder, and grading the nature of the cancerous tissue.

The polynucleotides of the invention can be useful to monitor patients
having or susceptible to cancer to detect potentially malignant events at a molecular
level before they are detectable at a gross morphological level. Furthermore, a
polynucleotide of the invention identified as important for one type of cancer can also
have implications for development or risk of development of other types of cancer, e.g.,
where a polynucleotide is differentially expressed across various cancer types. Thus,
for example, expression of a polynucleotide that has clinical implications for metastatic
colon cancer can also have clinical implications for stomach cancer or endometrial
cancer.

Staging. Staging is a process used by physicians to describe how
advanced the cancerous state is in a patient. Generally, if a cancer is only detectable in
the area of the primary lesion without having spread to any lymph nodes it is called
Stage I. If it has spread only to the closest lymph nodes, it is called Stage II. In Stage
III, the cancer has generally spread to the lymph nodes in near proximity to the site of
the primary lesion. Cancers that have spread to a distant part of the body, such as the
liver, bone, brain or other site, are Stage IV, the most advanced stage.

The polynucleotides of the invention can facilitate fine-tuning of the
staging process by identifying markers for the aggressivity of a cancer, e.g., the
metastatic potential, as well as the presence in different areas of the body. Thus,
detection in a Stage II cancer of a polynucleotide signifying high metastatic potential
can be used to change a borderline Stage II tumor to a Stage III tumor, justifying more
aggressive therapy. Conversely, the presence of a polynucleotide signifying a lower
metastatic potential allows more conservative staging of a tumor.

Grading of cancers. Grade is a term used to describe how closely a
tumor resembles normal tissue of its same type. The microscopic appearance of a tumor
is used to identify tumor grade based on parameters such as cell morphology, cellular
organization, and other markers of differentiation. As a general rule, the grade of a
tumor corresponds to its rate of growth or aggressiveness, with undifferentiated or
high-grade tumors being more aggressive than well differentiated or low-grade tumors. The
following guidelines are generally used for grading tumors: 1) GX Grade cannot be
assessed; 2) G1 Well differentiated; 3) G2 Moderately well differentiated; 4) G3 Poorly
differentiated; 5) G4 Undifferentiated. The polynucleotides of the invention can be
especially valuable in determining the grade of the tumor, as they not only can aid in
determining the differentiation status of the cells of a tumor, they can also identify
factors other than differentiation that are valuable in determining the aggressivity of a
tumor, such as metastatic potential.

Detection of lung cancer. The polynucleotides of the invention can be
used to detect lung cancer in a subject. Although there are more than a dozen different
kinds of lung cancer, the two main types of lung cancer are small cell and nonsmall cell,
which encompass about 90% of all lung cancer cases. Small cell carcinoma (also called
oat cell carcinoma) usually starts in one of the larger bronchial tubes, grows fairly
rapidly, and is likely to be large by the time of diagnosis. Nonsmall cell lung cancer
(NSCLC) is made up of three general subtypes of lung cancer. Epidermoid carcinoma
(also called squamous cell carcinoma) usually starts in one of the larger bronchial tubes
and grows relatively slowly. The size of these tumors can range from very small to
quite large. Adenocarcinoma starts growing near the outside surface of the lung and can
vary in both size and growth rate. Some slowly growing adenocarcinomas are described
as alveolar cell cancer. Large cell carcinoma starts near the surface of the lung, grows
rapidly, and the growth is usually fairly large when diagnosed. Other less common
forms of lung cancer are carcinoid, cylindroma, mucoepidermoid, and malignant
mesothelioma.

The polynucleotides of the invention, e.g., polynucleotides differentially
expressed in normal cells versus cancerous lung cells (e.g., tumor cells of high or low
metastatic potential) or between types of cancerous lung cells (e.g., high metastatic
versus low metastatic), can be used to distinguish types of lung cancer as well as
to identify traits specific to a certain patient's cancer and select an appropriate
therapy. For example, if the patient's biopsy expresses a polynucleotide that is
associated with a low metastatic potential, it may justify leaving a larger portion of the
patient's lung in surgery to remove the lesion. Alternatively, a smaller lesion with
expression of a polynucleotide that is associated with high metastatic potential may
justify a more radical removal of lung tissue and/or the surrounding lymph nodes, even
if no metastasis can be identified through pathological examination.

Detection of breast cancer. The majority of breast cancers are
adenocarcinoma subtypes, which can be summarized as follows: 1) ductal carcinoma
in situ (DCIS), including comedocarcinoma; 2) infiltrating (or invasive) ductal
carcinoma (IDC); 3) lobular carcinoma in situ (LCIS); 4) infiltrating (or invasive)
lobular carcinoma (ILC); 5) inflammatory breast cancer; 6) medullary carcinoma;
7) mucinous carcinoma; 8) Paget's disease of the nipple; 9) Phyllodes tumor; and
10) tubular carcinoma.

The expression of polynucleotides of the invention can be used in the
diagnosis and management of breast cancer, as well as to distinguish between types of
breast cancer. Detection of breast cancer can be determined using expression levels of
any of the appropriate polynucleotides of the invention, either alone or in combination.
The aggressive nature and/or the metastatic potential of a breast cancer
can also be determined by comparing levels of one or more polynucleotides of the
invention with levels of another sequence known to vary in cancerous tissue,
e.g., ER expression. In addition, development of breast cancer can be detected by
examining the ratio of expression of a differentially expressed polynucleotide to the
levels of steroid hormones (e.g., testosterone or estrogen) or to other hormones (e.g.,
growth hormone, insulin). Thus expression of specific marker polynucleotides can be
used to discriminate between normal and cancerous breast tissue, to discriminate
between breast cancers with different cells of origin, to discriminate between breast
cancers with different potential metastatic rates, etc.

Detection of colon cancer. The polynucleotides of the invention
exhibiting the appropriate expression pattern can be used to detect colon cancer in a
subject. Colorectal cancer is one of the most common neoplasms in humans and
perhaps the most frequent form of hereditary neoplasia. Prevention and early detection
are key factors in controlling and curing colorectal cancer. Colorectal cancer begins as
polyps, which are small, benign growths of cells that form on the inner lining of the
colon. Over a period of several years, some of these polyps accumulate additional
mutations and become cancerous. Multiple familial colorectal cancer disorders have
been identified, which are summarized as follows: 1) Familial adenomatous polyposis
(FAP); 2) Gardner's syndrome; 3) Hereditary nonpolyposis colon cancer (HNPCC); and
4) Familial colorectal cancer in Ashkenazi Jews. The expression of appropriate
polynucleotides of the invention can be used in the diagnosis, prognosis and
management of colorectal cancer. Detection of colon cancer can be determined using
expression levels of any of these sequences alone or in combination with the levels of
expression. The aggressive nature and/or the metastatic potential of a
colon cancer can be determined by comparing levels of one or more polynucleotides of
the invention with total levels of another sequence known to vary in
cancerous tissue, e.g., expression of p53, DCC, ras, or FAP (see, e.g., Fearon ER, et al.,
Cell (1990) 61(5):759; Hamilton SR et al., Cancer (1993) 72:957; Bodmer W, et al.,
Nat Genet. (1994) 4(3):217; Fearon ER, Ann N Y Acad Sci. (1995) 768:101). For
example, development of colon cancer can be detected by examining the ratio of any of
the polynucleotides of the invention to the levels of oncogenes (e.g., ras) or tumor
suppressor genes (e.g., FAP or p53). Thus expression of specific marker
polynucleotides can be used to discriminate between normal and cancerous colon tissue,
to discriminate between colon cancers with different cells of origin, to discriminate
between colon cancers with different potential metastatic rates, etc.

Use of Polynucleotides to Screen for Peptide Analogs and Antagonists

Polypeptides encoded by the instant polynucleotides and corresponding
full length genes can be used to screen peptide libraries to identify binding partners,
such as receptors, from among the encoded polypeptides. Peptide libraries can be
synthesized according to methods known in the art (see, e.g., U.S. Patent No. 5,010,175,
and WO 91/17823). Agonists or antagonists of the polypeptides of the invention can be
screened using any available method known in the art, such as signal transduction,
antibody binding, receptor binding, mitogenic assays, chemotaxis assays, etc. The
assay conditions ideally should resemble the conditions under which the native activity
is exhibited in vivo, that is, under physiologic pH, temperature, and ionic strength.
Suitable agonists or antagonists will exhibit strong inhibition or enhancement of the
native activity at concentrations that do not cause toxic side effects in the subject.
Agonists or antagonists that compete for binding to the native polypeptide can require
concentrations equal to or greater than the native concentration, while inhibitors capable
of binding irreversibly to the polypeptide can be added in concentrations on the order of
the native concentration.

Such screening and experimentation can lead to identification of a novel
polypeptide binding partner, such as a receptor, encoded by a gene or a cDNA
corresponding to a polynucleotide of the invention, and at least one peptide agonist or
antagonist of the novel binding partner. Such agonists and antagonists can be used to
modulate, enhance, or inhibit receptor function in cells to which the receptor is native,
or in cells that possess the receptor as a result of genetic engineering. Further, if the
novel receptor shares biologically important characteristics with a known receptor,
information about agonist/antagonist binding can facilitate development of improved
agonists/antagonists of the known receptor.

Pharmaceutical Compositions and Therapeutic Uses

Pharmaceutical compositions of the invention can comprise
polypeptides, antibodies, or polynucleotides (including antisense nucleotides and
ribozymes) of the claimed invention in a therapeutically effective amount. The term
"therapeutically effective amount" as used herein refers to an amount of a therapeutic
agent sufficient to treat, ameliorate, or prevent a desired disease or condition, or to exhibit a
detectable therapeutic or preventative effect. The effect can be detected by, for
example, chemical markers or antigen levels. Therapeutic effects also include reduction
in physical symptoms, such as decreased body temperature. The precise effective
amount for a subject will depend upon the subject's size and health, the nature and
extent of the condition, and the therapeutics or combination of therapeutics selected for
administration. Thus, it is not useful to specify an exact effective amount in advance.
However, the effective amount for a given situation is determined by routine
experimentation and is within the judgment of the clinician. For purposes of the present
invention, an effective dose will generally be from about 0.01 mg/kg to 50 mg/kg or
0.05 mg/kg to about 10 mg/kg of the DNA constructs in the individual to which it is
administered.
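
The weight-based ranges stated above convert to absolute amounts by simple multiplication. The following is a minimal sketch of that arithmetic only; the function name `dose_range_mg` and the 70 kg example weight are illustrative assumptions and do not appear in the specification.

```python
def dose_range_mg(weight_kg, low_mg_per_kg=0.01, high_mg_per_kg=50.0):
    """Return (low, high) absolute amounts in mg for a given body weight.

    The defaults are the broader range quoted in the text
    (about 0.01 mg/kg to 50 mg/kg).
    """
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    if low_mg_per_kg > high_mg_per_kg:
        raise ValueError("low bound exceeds high bound")
    return weight_kg * low_mg_per_kg, weight_kg * high_mg_per_kg


if __name__ == "__main__":
    # Hypothetical 70 kg subject: broader range, then the narrower range.
    print(dose_range_mg(70))
    print(dose_range_mg(70, 0.05, 10.0))
```

As the text notes, the actual effective amount for a given situation is set by routine experimentation and clinical judgment, so a calculation like this only bounds the range.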

A pharmaceutical composition can also contain a pharmaceutically
acceptable carrier. The term "pharmaceutically acceptable carrier" refers to a carrier for
administration of a therapeutic agent, such as antibodies or a polypeptide, genes, and
other therapeutic agents. The term refers to any pharmaceutical carrier that does not
itself induce the production of antibodies harmful to the individual receiving the
composition, and which can be administered without undue toxicity. Suitable carriers
can be large, slowly metabolized macromolecules such as proteins, polysaccharides,
polylactic acids, polyglycolic acids, polymeric amino acids, amino acid copolymers,
and inactive virus particles. Such carriers are well known to those of ordinary skill in
the art. Pharmaceutically acceptable carriers in therapeutic compositions can include
liquids such as water, saline, glycerol and ethanol. Auxiliary substances, such as
wetting or emulsifying agents, pH buffering substances, and the like, can also be present
in such vehicles. Typically, the therapeutic compositions are prepared as injectables,
either as liquid solutions or suspensions; solid forms suitable for solution in, or
suspension in, liquid vehicles prior to injection can also be prepared. Liposomes are
included within the definition of a pharmaceutically acceptable carrier.
Pharmaceutically acceptable salts can also be present in the pharmaceutical
composition, e.g., mineral acid salts such as hydrochlorides, hydrobromides,
phosphates, sulfates, and the like; and the salts of organic acids such as acetates,
propionates, malonates, benzoates, and the like. A thorough discussion of
pharmaceutically acceptable excipients is available in Remington's Pharmaceutical
Sciences (Mack Pub. Co., New Jersey, 1991).

Delivery Methods. Once formulated, the compositions of the invention
can be (1) administered directly to the subject (e.g., as polynucleotide or polypeptides);
or (2) delivered ex vivo, to cells derived from the subject (e.g., as in ex vivo gene
therapy). Direct delivery of the compositions will generally be accomplished by
parenteral injection, e.g., subcutaneously, intraperitoneally, intravenously or
intramuscularly, intratumorally or to the interstitial space of a tissue. Other modes of
administration include oral and pulmonary administration, suppositories, and
transdermal applications, needles, and gene guns or hyposprays. Dosage treatment can
be a single dose schedule or a multiple dose schedule.



Methods for the ex vivo delivery and reimplantation of transformed cells
into a subject are known in the art and described in, e.g., International Publication No.
WO 93/14778. Examples of cells useful in ex vivo applications include, for example,
stem cells, particularly hematopoietic, lymph cells, macrophages, dendritic cells, or
tumor cells. Generally, delivery of nucleic acids for both ex vivo and in vitro
applications can be accomplished by, for example, dextran-mediated transfection,
calcium phosphate precipitation, polybrene-mediated transfection, protoplast fusion,
electroporation, encapsulation of the polynucleotide(s) in liposomes, and direct
microinjection of the DNA into nuclei, all well known in the art.

Once a gene corresponding to a polynucleotide of the invention has been
found to correlate with a proliferative disorder, such as neoplasia, dysplasia, and
hyperplasia, the disorder can be amenable to treatment by administration of a
therapeutic agent based on the provided polynucleotide, corresponding polypeptide or
other corresponding molecule (e.g., antisense, ribozyme, etc.).

The dose and the means of administration of the inventive
pharmaceutical compositions are determined based on the specific qualities of the
therapeutic composition, the condition, age, and weight of the patient, the progression
of the disease, and other relevant factors. For example, administration of
polynucleotide therapeutic composition agents of the invention includes local or
systemic administration, including injection, oral administration, particle gun or
catheterized administration, and topical administration. Preferably, the therapeutic
polynucleotide composition contains an expression construct comprising a promoter
operably linked to a polynucleotide of at least 12, 22, 25, 30, or 35 contiguous nt of the
polynucleotide disclosed herein. Various methods can be used to administer the
therapeutic composition directly to a specific site in the body. For example, a small
metastatic lesion is located and the therapeutic composition injected several times in
several different locations within the body of the tumor. Alternatively, arteries which serve
a tumor are identified, and the therapeutic composition injected into such an artery, in
order to deliver the composition directly into the tumor. A tumor that has a necrotic
center is aspirated and the composition injected directly into the now empty center of
the tumor. The antisense composition is directly administered to the surface of the
tumor, for example, by topical application of the composition. X-ray imaging is used to
assist in certain of the above delivery methods.

Receptor-mediated targeted delivery of therapeutic compositions
containing an antisense polynucleotide, subgenomic polynucleotides, or antibodies to
specific tissues can also be used. Receptor-mediated DNA delivery techniques are
described in, for example, Findeis et al., Trends Biotechnol. (1993) 11:202; Chiou et al.,
Gene Therapeutics: Methods And Applications Of Direct Gene Transfer (J.A. Wolff,
ed.) (1994); Wu et al., J. Biol. Chem. (1988) 263:621; Wu et al., J. Biol. Chem. (1994)
269:542; Zenke et al., Proc. Natl. Acad. Sci. (USA) (1990) 87:3655; Wu et al., J. Biol.
Chem. (1991) 266:338. Therapeutic compositions containing a polynucleotide are
administered in a range of about 100 ng to about 200 mg of DNA for local
administration in a gene therapy protocol. Concentration ranges of about 500 ng to
about 50 mg, about 1 mg to about 2 mg, about 5 mg to about 500 mg, and about 20 mg
to about 100 mg of DNA can also be used during a gene therapy protocol. Factors such
as method of action (e.g., for enhancing or inhibiting levels of the encoded gene
product) and efficacy of transformation and expression are considerations which will
affect the dosage required for ultimate efficacy of the antisense subgenomic
polynucleotides. Where greater expression is desired over a larger area of tissue, larger
amounts of antisense subgenomic polynucleotides or the same amounts readministered
in a successive protocol of administrations, or several administrations to different
adjacent or close tissue portions of, for example, a tumor site, may be required to effect
a positive therapeutic outcome. In all cases, routine experimentation in clinical trials
will determine specific ranges for optimal therapeutic effect. For polynucleotide-related
genes encoding polypeptides or proteins with anti-inflammatory activity, suitable use,
doses, and administration are described in U.S. Patent No. 5,654,173.

The therapeutic polynucleotides and polypeptides of the present
invention can be delivered using gene delivery vehicles. The gene delivery vehicle can
be of viral or non-viral origin (see generally, Jolly, Cancer Gene Therapy (1994) 1:51;
Kimura, Human Gene Therapy (1994) 5:845; Connelly, Human Gene Therapy (1995)
7:185; and Kaplitt, Nature Genetics (1994) 6:148). Expression of such coding
sequences can be induced using endogenous mammalian or heterologous promoters.
Expression of the coding sequence can be either constitutive or regulated.

Viral-based vectors for delivery of a desired polynucleotide and
expression in a desired cell are well known in the art. Exemplary viral-based vehicles
include, but are not limited to, recombinant retroviruses (see, e.g., WO 90/07936; WO
94/03622; WO 93/25698; WO 93/25234; U.S. Patent No. 5,219,740; WO 93/11230;
WO 93/10218; U.S. Patent No. 4,777,127; GB Patent No. 2,200,651; EP 0 345 242; and
WO 91/02805), alphavirus-based vectors (e.g., Sindbis virus vectors, Semliki forest
virus (ATCC VR-67; ATCC VR-1247), Ross River virus (ATCC VR-373; ATCC VR-1246)
and Venezuelan equine encephalitis virus (ATCC VR-923; ATCC VR-1250;
ATCC VR-1249; ATCC VR-532)), and adeno-associated virus (AAV) vectors (see, e.g.,
WO 94/12649; WO 93/03769; WO 93/19191; WO 94/28938; WO 95/11984 and WO
95/00655). Administration of DNA linked to killed adenovirus as described in Curiel,
Hum. Gene Ther. (1992) 3:147 can also be employed.

Non-viral delivery vehicles and methods can also be employed,
including, but not limited to, polycationic condensed DNA linked or unlinked to killed
adenovirus alone (see, e.g., Curiel, Hum. Gene Ther. (1992) 3:147); ligand-linked
DNA (see, e.g., Wu, J. Biol. Chem. 264:16985 (1989)); eukaryotic cell delivery vehicle
cells (see, e.g., U.S. Patent No. 5,814,482; WO 95/07994; WO 96/17072;
WO 95/30763; and WO 97/42338) and nucleic charge neutralization or fusion with cell
membranes. Naked DNA can also be employed. Exemplary naked DNA introduction
methods are described in WO 90/11092 and U.S. Patent No. 5,580,859. Liposomes that
can act as gene delivery vehicles are described in U.S. Patent No. 5,422,120; WO
95/13796; WO 94/23697; WO 91/14445; and EP 0524968. Additional approaches are
described in Philip, Mol. Cell Biol. 14:2411 (1994), and in Woffendin, Proc. Natl.
Acad. Sci. (1994) 91:1581.

Further non-viral delivery suitable for use includes mechanical delivery
systems such as the approach described in Woffendin et al., Proc. Natl. Acad. Sci. USA
91(24):11581 (1994). Moreover, the coding sequence and the product of expression of
such can be delivered through deposition of photopolymerized hydrogel materials or
use of ionizing radiation (see, e.g., U.S. Patent No. 5,206,152 and WO 92/11033).
Other conventional methods for gene delivery that can be used for delivery of the
coding sequence include, for example, use of hand-held gene transfer particle gun (see,
e.g., U.S. Patent No. 5,149,655); use of ionizing radiation for activating transferred gene
(see, e.g., U.S. Patent No. 5,206,152 and WO 92/11033).

The present invention will now be illustrated by reference to the
following examples which set forth particularly advantageous embodiments. However,
it should be noted that these embodiments are illustrative and are not to be construed as
restricting the invention in any way.

EXAMPLES
EXAMPLE 1 

Source of Biological Materials and Overview of Novel Polynucleotides 
Expressed by the Biological Materials 

Cell lines and human normal and tumor tissue were used to construct
cDNA libraries from mRNA isolated from the cells and tissues. Most sequences were
about 275-300 nucleotides in length. The cell lines include the KM12L4-A cell line, a
highly metastatic colon cancer cell line (Morikawa, K. et al., Cancer Research (1988)
48:6863). The KM12L4-A cell line is derived from the KM12C cell line. The KM12C
cell line, which is poorly metastatic (low metastatic), was established in culture from a
Dukes' stage B2 surgical specimen (Morikawa et al. Cancer Res. (1988) 48:6863). The
KM12L4-A is a highly metastatic subline derived from KM12C (Yeatman et al. Nucl.
Acids Res. (1995) 23:4007; Bao-Ling et al. Proc. Annu. Meet. Am. Assoc. Cancer. Res.
(1995) 27:3269). The KM12C and KM12C-derived cell lines (e.g., KM12L4,
KM12L4-A, etc.) are well-recognized in the art as model cell lines for the study of
colon cancer (see, e.g., Morikawa et al., supra; Radinsky et al. Clin. Cancer Res.
(1995) 1:19; Yeatman et al., (1995) supra; Yeatman et al., Clin. Exp. Metastasis (1996)
14:246). These and other cell lines and tissue are described in Table 6.

The sequences of the isolated polynucleotides were first masked to
eliminate low complexity sequences using the XBLAST masking program (Claverie
"Effective Large-Scale Sequence Similarity Searches," In: Computer Methods for
Macromolecular Sequence Analysis, Doolittle, ed., Meth. Enzymol. 266:212-227
Academic Press, NY, NY (1996); see particularly Claverie, in "Automated DNA
Sequencing and Analysis Techniques" Adams et al., eds., Chap. 36, p. 267 Academic
Press, San Diego, 1994 and Claverie et al. Comput. Chem. (1993) 17:191). Generally,
masking does not influence the final search results, except to eliminate sequences of
relatively little interest due to their low complexity, and to eliminate multiple "hits" based
on similarity to repetitive regions common to multiple sequences, e.g., Alu repeats. The
sequences remaining after masking were then used in a BLASTN vs. Genbank search;
sequences that exhibited greater than 70% overlap, 99% identity, and a p value of less
than 1 x 10^-40 were discarded. Sequences from this search also were discarded if the
inclusive parameters were met, but the sequence was ribosomal or vector-derived.
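
The discard rule for the BLASTN vs. Genbank search can be sketched as a simple predicate over each sequence's best hit. This is a minimal illustration, not the pipeline actually used: the `Hit` record, its field names, and the function names are hypothetical, "99% identity" is read here as "at least 99%", and a hit flagged as ribosomal or vector-derived is assumed to be discarded regardless of the numeric thresholds.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Hit:
    """Best database hit for one query sequence (hypothetical record)."""
    overlap_pct: float
    identity_pct: float
    p_value: float
    ribosomal_or_vector: bool = False


def is_discarded(hit: Hit) -> bool:
    """Apply the stated thresholds: >70% overlap, 99% identity, p < 1e-40."""
    meets_thresholds = (
        hit.overlap_pct > 70.0
        and hit.identity_pct >= 99.0  # assumption: "99% identity" means >= 99
        and hit.p_value < 1e-40
    )
    # Assumption: ribosomal or vector-derived sequences are always dropped.
    return meets_thresholds or hit.ribosomal_or_vector


def filter_novel(best_hits: Dict[str, Optional[Hit]]) -> List[str]:
    """Return names of sequences retained after the search (None = no hit)."""
    return [
        name
        for name, hit in best_hits.items()
        if hit is None or not is_discarded(hit)
    ]


if __name__ == "__main__":
    hits = {
        "seqA": None,                    # no Genbank hit: retained
        "seqB": Hit(85.0, 99.5, 1e-50),  # near-identical known sequence
        "seqC": Hit(40.0, 80.0, 1e-10),  # partial similarity: retained
    }
    print(filter_novel(hits))
```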

The resulting sequences from the previous search were classified into
three groups (1, 2 and 3 below) and searched in a BLASTX vs. NRP (non-redundant
proteins) database search: (1) unknown (no hits in the Genbank search), (2) weak
similarity (greater than 45% identity and p value of less than 1 x 10^-5), and (3) high
similarity (greater than 60% overlap, greater than 80% identity, and p value less than 1
x 10^-5). Sequences having greater than 70% overlap, greater than 99% identity, and p
value of less than 1 x 10^-40 were discarded.
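
The three similarity groups can be expressed as a small decision function. The sketch below is illustrative only: the function name and the tuple layout are assumptions, and the text does not say how a hit that meets neither threshold set is handled, so such hits are returned here under a separate, hypothetical "unclassified" label.

```python
from typing import Optional, Tuple

# (overlap_pct, identity_pct, p_value) for the best hit; None means no hit.
BestHit = Optional[Tuple[float, float, float]]


def classify_similarity(hit: BestHit) -> str:
    """Assign a sequence to one of the groups described in the text."""
    if hit is None:
        return "unknown"
    overlap, identity, p_value = hit
    # Check the stricter group first so a strong hit is not labeled weak.
    if overlap > 60.0 and identity > 80.0 and p_value < 1e-5:
        return "high similarity"
    if identity > 45.0 and p_value < 1e-5:
        return "weak similarity"
    # Assumption: the source does not define this case.
    return "unclassified"


if __name__ == "__main__":
    for example in (None, (75.0, 90.0, 1e-20), (30.0, 50.0, 1e-8)):
        print(example, "->", classify_similarity(example))
```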

The remaining sequences were classified as unknown (no hits), weak
similarity, and high similarity (parameters as above). Two searches were performed on
these sequences. First, a BLAST vs. EST database search was performed and
sequences with greater than 99% overlap, greater than 99% similarity and a p value of
less than 1 x 10^-40 were discarded. Sequences with a p value of less than 1 x 10^-65 when
compared to a database sequence of human origin were also excluded. Second, a
BLASTN vs. Patent GeneSeq database was performed and sequences having greater
than 99% identity, p value less than 1 x 10^-40, and greater than 99% overlap were
discarded.

The remaining sequences were subjected to screening using other rules
and redundancies in the dataset. Sequences with a p value of less than 1 x 10^-111 in
relation to a database sequence of human origin were specifically excluded. The final
result provided the 3351 sequences listed in the accompanying Sequence Listing. Each
identified polynucleotide represents sequence from at least a partial mRNA transcript.
Polynucleotides that were determined to be novel were assigned a sequence
identification number.

The novel polynucleotides were assigned sequence identification numbers
SEQ ID NOs:1-3351. The first 1847 DNA sequences corresponding to the novel
polynucleotides are provided in the Sequence Listing in Table 1. DNA sequences
corresponding to the novel polynucleotides of SEQ ID NOs:1848-3351 are provided in the
Sequence Listing in Table 2. The DNA sequences of Table 2, while numbered SEQ ID 1-
1504, correspond to SEQ ID NOs:1848-3351 in the Sequence Listing, e.g., Table 2 SEQ ID
1 is SEQ ID NO:1848, Table 2 SEQ ID 2 is SEQ ID NO:1849, etc. Each DNA sequence in
Table 4 is uniquely identified by a number that is 1847 less than its SEQ ID NO in the
Sequence Listing. Tables 1 and 2 provide: 1) the SEQ ID NO assigned to each sequence
for use in the present specification or a corresponding number; 2) the sequence name used
as an internal identifier of the sequence; 3) the name assigned to the clone from which the
sequence was isolated; and 4) the number of the cluster to which the sequence is assigned
(Cluster ID; where the cluster ID is 0, the sequence was not assigned to any cluster).
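
The renumbering between the table-local identifiers (1-1504) and the Sequence Listing (SEQ ID NOs:1848-3351) is a fixed offset of 1847. A minimal sketch of the conversion in both directions follows; the function and constant names are illustrative assumptions.

```python
OFFSET = 1847        # SEQ ID NO minus table-local number, per the text
LOCAL_COUNT = 1504   # table-local numbers run from 1 to 1504


def local_to_seq_id_no(local_number: int) -> int:
    """Convert a table-local number (1-1504) to its SEQ ID NO (1848-3351)."""
    if not 1 <= local_number <= LOCAL_COUNT:
        raise ValueError("table-local number must be in 1..1504")
    return local_number + OFFSET


def seq_id_no_to_local(seq_id_no: int) -> int:
    """Convert a SEQ ID NO (1848-3351) back to its table-local number."""
    if not OFFSET + 1 <= seq_id_no <= OFFSET + LOCAL_COUNT:
        raise ValueError("SEQ ID NO must be in 1848..3351")
    return seq_id_no - OFFSET


if __name__ == "__main__":
    print(local_to_seq_id_no(1))     # first renumbered sequence
    print(local_to_seq_id_no(1504))  # last renumbered sequence
    print(seq_id_no_to_local(1849))
```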

Because the provided polynucleotides represent partial mRNA
transcripts, two or more polynucleotides of the invention may represent different
regions of the same mRNA transcript and the same gene. Thus, if two or more SEQ ID
NOs: are identified as belonging to the same clone, then either sequence can be used to
obtain the full-length mRNA or gene.



EXAMPLE 2 

Results of Public Database Search to Identify Function of Gene Products 

SEQ ID NOs:1-3351 were translated in all three reading frames to
determine the best alignment with the individual sequences. These amino acid
sequences and nucleotide sequences are referred to, generally, as query sequences,
which are aligned with the individual sequences. Query and individual sequences were
aligned using the BLAST programs, available over the world wide web at
http://www.ncbi.nlm.nih.gov/BLAST/. Again the sequences were masked to various
extents to prevent searching of repetitive sequences or poly-A sequences, using the
XBLAST program for masking low complexity as described above in Example 1.
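
Translation in three reading frames means reading codons starting at offsets 0, 1, and 2 of the nucleotide sequence. The sketch below illustrates this with the standard genetic code using only the Python standard library; it is an assumption-laden illustration (function names are hypothetical, and codons containing ambiguous bases such as N are rendered as "X"), not the software used to generate the alignments.

```python
from typing import Dict, List

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

CODON_TABLE: Dict[str, str] = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate one forward frame (0, 1, or 2); '*' marks a stop codon."""
    seq = dna.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    # Codons with ambiguous bases are not in the table and become 'X'.
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def translate_three_frames(dna: str) -> List[str]:
    """Return the translations for frames 0, 1, and 2 of the given strand."""
    return [translate(dna, frame) for frame in range(3)]


if __name__ == "__main__":
    for frame, protein in enumerate(translate_three_frames("ATGGCCTAA")):
        print(frame, protein)
```

Each translated frame would then serve as a protein query sequence, with the untranslated nucleotide sequence used for the DNA-level searches.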

Tables 3 and 4 (inserted before the claims) show the results of the
alignments. Table 3 contains alignment information for SEQ ID NOs:1-1847 and Table 4
contains alignment information for SEQ ID NOs:1848-3351. The DNA sequences of Table
4, while numbered SEQ ID 1-1504, correspond to SEQ ID NOs:1848-3351. Each DNA
sequence in Table 4 is uniquely identified by a number that is 1847 less than its SEQ ID
NO. Tables 3 and 4 refer to each sequence by its SEQ ID NO or a corresponding number,
the accession numbers and descriptions of nearest neighbors from the Genbank and Non-Redundant
Protein searches, and the p values of the search results.

For each of SEQ ID NOs:1-1847, the best alignment to a protein or DNA
sequence is included in Table 3, and the best alignment for each of SEQ ID NOs:1848-3351
is included in Table 4. The activity of the polypeptide encoded by SEQ ID
NOs:1-3351 is the same or similar to the nearest neighbor reported in Table 3 or 4. The
accession number of the nearest neighbor is reported, providing a reference to the activities
exhibited by the nearest neighbor. The search program and database used for the alignment
also are indicated as well as a calculation of the p value.

Full length sequences or fragments of the polynucleotide sequences of
the nearest neighbors can be used as probes and primers to identify and isolate the full
length sequence of SEQ ID NOs:1-3351. The nearest neighbors can indicate a tissue or
cell type to be used to construct a library for the full-length sequences of SEQ ID
NOs:1-3351.

EXAMPLE 3 
Members of Protein Families 

The sequences (SEQ ID NOs:1-3351) were used to conduct a profile
search as described in the specification above. Several of the polynucleotides of the
invention were found to encode polypeptides having characteristics of a polypeptide
belonging to a known protein family (and thus represent new members of these
protein families) and/or comprising a known functional domain (Table 5). "Start" and
"stop" in Table 3 indicate the position within the individual sequences that align with
the query sequence having the indicated SEQ ID NO. The direction indicates the
orientation of the query sequence with respect to the individual sequence, where
forward (for) indicates that the alignment is in the same direction (left to right) as the
sequence provided in the Sequence Listing and reverse (rev) indicates that the
alignment is with a sequence complementary to the sequence provided in the Sequence
Listing.

Some polynucleotides exhibited multiple profile hits because, for 
example, the particular sequence contains overlapping profile regions, and/or the 
sequence contains two different functional domains. These profile hits are described in 
more detail below. 

Ank Repeats (ANK). SEQ ID NOs:187, 1268, 1804, 1819, 1830, 1839,
2652, 3015 and 3267 represent polynucleotides encoding an Ank repeat-containing
protein. The ankyrin motif is a 33 amino acid sequence named for the protein ankyrin,
which has 24 tandem 33-amino-acid motifs. Ank repeats were originally identified in
the cell-cycle-control protein cdc10 (Breeden et al., Nature (1987) 329:651). Proteins
containing ankyrin repeats include ankyrin, myotropin, I-kappaB proteins, cell cycle
protein cdc10, the Notch receptor (Matsuno et al., Development (1997) 124(21):4265),
G9a (or BAT8) of the class III region of the major histocompatibility complex
(Biochem J. 290:811-818, 1993), FABP, GABP, 53BP2, Lin12, glp-1, SWI4, and
SWI6. The functions of the ankyrin repeats are compatible with a role in
protein-protein interactions (Bork, Proteins (1993) 17(4):363; Lambert and Bennet, Eur. J.
Biochem. (1993) 211:1; Kerr et al., Current Op. Cell Biol. (1992) 4:496; Bennet et al.,
J. Biol. Chem. (1980) 255:6424).

ATPases Associated with Various Cellular Activities (ATPases).
Sequences within SEQ ID NOs:431, 639, 2135, 2684, 2859, 3197 and 3266 correspond
to a sequence that encodes a novel member of the "ATPases Associated with diverse
cellular Activities" (AAA) protein family. The AAA protein family is composed of a
large number of ATPases that share a conserved region of about 220 amino acids that
contains an ATP-binding site (Froehlich et al., J. Cell Biol. (1991) 114:443; Erdmann et
al., Cell (1991) 64:499; Peters et al., EMBO J. (1990) 9:1757; Kunau et al., Biochimie
(1993) 75:209-224; Confalonieri et al., BioEssays (1995) 17:639;
http://yeamob.pci.chemie.uni-tuebingen.de/AAA/Description.html). The proteins that
belong to this family contain either one or two AAA domains. In general, the AAA
domains in these proteins act as ATP-dependent protein clamps (Confalonieri et al.
(1995) BioEssays 17:639). In addition to the ATP-binding 'A' and 'B' motifs, which are
located in the N-terminal half of this domain, there is a highly conserved region located
in the central part of the domain which was used in the development of the signature
pattern. The consensus pattern is: [LIVMT]-x-[LIVMT]-[LIVMF]-x-[GATMC]-[ST]-
[NS]-x(4)-[LIVM]-D-x-A-[LIFA]-x-R.
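
Consensus patterns written in this notation (square brackets for allowed residues, curly braces for excluded residues, `x` for any residue, and parenthesized repeat counts) can be converted mechanically into regular expressions for scanning a translated sequence. The following is a minimal sketch of such a converter; the function name is hypothetical, only the notation elements that appear in this section are handled, and the test peptide is invented for illustration.

```python
import re

# One pattern element: [ABC], {ABC}, a single residue, or x,
# optionally followed by (n) or (n,m).
_ELEMENT = re.compile(
    r"(\[[A-Z]+\]|\{[A-Z]+\}|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?"
)


def pattern_to_regex(pattern: str) -> str:
    """Convert a dash-separated consensus pattern into a regex string."""
    parts = []
    cleaned = pattern.replace(" ", "").replace("\n", "").rstrip(".")
    for element in cleaned.split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = match.groups()
        if core == "x":
            token = "."                        # any residue
        elif core.startswith("{"):
            token = "[^" + core[1:-1] + "]"    # excluded residues
        else:
            token = core                       # residue or [allowed set]
        if low is not None:
            token += "{" + low + ("," + high if high else "") + "}"
        parts.append(token)
    return "".join(parts)


if __name__ == "__main__":
    aaa = ("[LIVMT]-x-[LIVMT]-[LIVMF]-x-[GATMC]-[ST]-"
           "[NS]-x(4)-[LIVM]-D-x-A-[LIFA]-x-R")
    regex = pattern_to_regex(aaa)
    print(regex)
    # Invented peptide that satisfies the pattern, for demonstration only.
    print(bool(re.search(regex, "MKLALLAGSNAAAALDAALARQ")))
```

A regular-expression scan of this kind only reports exact pattern matches; the profile searches referred to in this Example are a separate, more sensitive method.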

Bromodomain (bromodomain). SEQ ID NO:1814 represents a
polynucleotide encoding a polypeptide having a bromodomain region (Haynes et al.,
1992, Nucleic Acids Res. 20:2693-2603, Tamkun et al., 1992, Cell 68:561-572, and
Tamkun, 1995, Curr. Opin. Genet. Dev. 5:473-477), which is a conserved region of
about 70 amino acids. The bromodomain is thought to be involved in protein-protein
interactions and may be important for the assembly or activity of multicomponent
complexes involved in transcriptional activation. The consensus pattern, which spans a
major part of the bromodomain, is: [STANVF]-x(2)-F-x(4)-[DNS]-x(5,7)-[DENQTF]-
Y-[HFY]-x(2)-[LIVMFY]-x(3)-[LIVM]-x(4)-[LIVM]-x(6,8)-Y-x(12,13)-[LIVM]-x(2)-
N-[SACF]-x(2)-[FY].

Basic Region Plus Leucine Zipper Transcription Factors (BZIP). SEQ
ID NOs:410, 552, 768, 822, 836, 1288, 1365, 1454, 1540, 1549, 1556, 1557, 1563,
1622, 1630, 1704, 1808, 2363, 2424, 3147, 3152, 3158 and 3208 represent
polynucleotides encoding a novel member of the family of basic region plus leucine
zipper transcription factors. The bZIP superfamily (Hurst, Protein Prof. (1995) 2:105;
and Ellenberger, Curr. Opin. Struct. Biol. (1994) 4:12) of eukaryotic DNA-binding
transcription factors encompasses proteins that contain a basic region mediating
sequence-specific DNA-binding followed by a leucine zipper required for dimerization.
The consensus pattern for this protein family is: [KR]-x(1,3)-[RKSAQ]-N-x(2)-
[SAQ](2)-x-[RKTAENQ]-x-R-x-[RK].

EF Hand (EFhand). SEQ ID NOs:820, 1755 and 3285 correspond to
polynucleotides encoding a novel protein in the family of EF-hand proteins. Many
calcium-binding proteins belong to the same evolutionary family and share a type of
calcium-binding domain known as the EF-hand (Kawasaki et al., Protein. Prof. (1995)
2:305-490). This type of domain consists of a twelve residue loop flanked on both sides
by a twelve residue alpha-helical domain. In an EF-hand loop the calcium ion is
coordinated in a pentagonal bipyramidal configuration. The six residues involved in the
binding are in positions 1, 3, 5, 7, 9 and 12; these residues are denoted by X, Y, Z, -Y,
-X and -Z. The invariant Glu or Asp at position 12 provides two oxygens for liganding
Ca (bidentate ligand). The consensus pattern includes the complete EF-hand loop as
well as the first residue which follows the loop and which seems always to be
hydrophobic: D-x-[DNS]-{ILVFYW}-[DENSTG]-[DNQGHRK]-{GP}-[LIVMC]-
[DENQSTAGC]-x(2)-[DE]-[LIVMFYW].

Ets Domain (Ets_Nterm). SEQ ID NO:1811 represents a polynucleotide
encoding a polypeptide with N-terminal homology in the ETS domain. Proteins of this
family contain a conserved domain, the "ETS-domain," that is involved in DNA
binding. The domain appears to recognize purine-rich sequences; it is about 85 to 90
amino acids in length, and is rich in aromatic and positively charged residues (Wasylyk,
et al., Eur. J. Biochem. (1993) 211:7-18). The ets gene family encodes a novel class of
DNA-binding proteins, each of which binds a specific DNA sequence and comprises an
ets domain that specifically interacts with sequences containing the common core
tri-nucleotide sequence GGA. In addition to an ets domain, native ets proteins comprise
other sequences which can modulate the biological specificity of the protein. Ets genes
and proteins are involved in a variety of essential biological processes including cell
growth, differentiation and development, and three members are implicated in the
oncogenic process.

G-Protein Alpha Subunit (G-alpha). SEQ ID NO:1846 represents a
polynucleotide encoding a novel polypeptide of the G-protein alpha subunit family.
Guanine nucleotide binding proteins (G-proteins) are a family of membrane-associated
proteins that couple extracellularly-activated integral-membrane receptors to
intracellular effectors, such as ion channels and enzymes that vary the concentration of
second messenger molecules. G-proteins are composed of 3 subunits (alpha, beta and
gamma) which, in the resting state, associate as a trimer at the inner face of the plasma
membrane. The alpha subunit binds GTP and exhibits GTPase activity. G-protein alpha
subunits are 350-400 amino acids in length and have molecular weights in the range
40-45 kDa. Seventeen distinct types of alpha subunit have been identified in mammals,
and fall into 4 main groups on the basis of both sequence similarity and function:
alpha-s, alpha-q, alpha-i and alpha-12 (Simon et al., Science (1993) 252:802). They are often
N-terminally acylated, usually with myristate and/or palmitate, and these fatty acid
modifications can be important for membrane association and high-affinity interactions
with other proteins.

Helicases conserved C-terminal domain (helicase_C). SEQ ID
NOs:1496, 2826 and 2871 represent polynucleotides encoding novel members of the
DEAD/H helicase family. A number of eukaryotic and prokaryotic proteins have been
characterized (Schmid S.R., et al., Mol. Microbiol. (1992) 6:283; Linder P., et al.,
Nature (1989) 337:121; Wassarman D.A., et al., Nature (1991) 349:463) on the basis of
their structural similarity. All are involved in ATP-dependent, nucleic-acid unwinding.
All DEAD box family members of the above proteins share a number of conserved
sequence motifs, some of which are specific to the DEAD family while others are
shared by other ATP-binding proteins or by proteins belonging to the helicases
'superfamily' (Hodgman T.C., Nature (1988) 333:22 and Nature (1988) 333:578
(Errata)). One of these motifs, called the "D-E-A-D-box", represents a special version of
the B motif of ATP-binding proteins. Some other proteins belong to a subfamily whose
members have His instead of the second Asp and are thus said to be "D-E-A-H-box" proteins
(Wassarman D.A., et al., Nature (1991) 349:463; Harosh I., et al., Nucleic Acids Res.
(1991) 19:6331; Koonin E.V. et al., J. Gen. Virol. (1992) 73:989). The following
signature patterns are used to identify members of both subfamilies: 1) [LIVMF](2)-D-
E-A-D-[RKEN]-x-[LIVMFYGSTN]; and 2) [GSAH]-x-[LIVMF](3)-D-E-[ALIV]-H-
[NECR].
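
The two signature patterns distinguish the D-E-A-D-box and D-E-A-H-box subfamilies, so a translated sequence can be assigned to one (or neither) by testing both. The sketch below hand-translates the two signatures into Python regular expressions; the function name and the example peptides are invented for illustration and are not sequences of the invention.

```python
import re

# Signature 1: [LIVMF](2)-D-E-A-D-[RKEN]-x-[LIVMFYGSTN]
DEAD_BOX = re.compile(r"[LIVMF]{2}DEAD[RKEN].[LIVMFYGSTN]")
# Signature 2: [GSAH]-x-[LIVMF](3)-D-E-[ALIV]-H-[NECR]
DEAH_BOX = re.compile(r"[GSAH].[LIVMF]{3}DE[ALIV]H[NECR]")


def helicase_subfamily(protein: str) -> str:
    """Report which signature, if any, occurs in a protein sequence."""
    seq = protein.upper()
    has_dead = DEAD_BOX.search(seq) is not None
    has_deah = DEAH_BOX.search(seq) is not None
    if has_dead and has_deah:
        return "both signatures"
    if has_dead:
        return "D-E-A-D-box"
    if has_deah:
        return "D-E-A-H-box"
    return "no signature"


if __name__ == "__main__":
    # Invented peptides used only to exercise each branch.
    for peptide in ("AAVLDEADRMLKK", "TTGCVIIDEAHEKK", "MKTAYIAKQR"):
        print(peptide, "->", helicase_subfamily(peptide))
```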

Homeobox domain ftiomeobox) . SEQ ID NOs:1676, 1820 and 1821 
30 represent polynucleotides encoding proteins having a homeobox domain. The 
homeobox is a protein domain of 60 amino acids (Gehring In: Guidebook to the 
Homeobox Genes , Duboule D., Ed., pp. 1-10, Oxford University Press, Oxford, (1994); 
Buerglin In: Guidebook to the Homeobox Genes . pp25-72, Oxford University Press, 
Oxford, (1994); Gehring, Trends Biochem, Scl (1992) 17:277-280; Gehring et al., 
35 Annu. Rev. Genet. (1986) 20:147-173; Schofield, Trends Neuroscl (1987) 70:3-6) first 
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identified in a number of Drosophila homeotic and segmentation proteins. It is 
extremely well conserved in many other animals, including vertebrates. This domain 
binds DNA through a helix-turn-helix type of structure. Several proteins that contain a 
homeobox domain play an important role in development. Most of these proteins are 

5 sequence-specific DNA-binding transcription factors. The homeobox domain is also 
very similar to a region of the yeast mating type proteins. These are sequence-specific 
DNA-binding proteins that act as master switches in yeast differentiation by controlling 
gene expression in a cell type-specific fashion. 

A schematic representation of the homeobox domain is shown below. 

1 0 The helix-tum-helix region is shown by the symbols 'H* (for helix), and T (for turn). 

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxHHHHHHHHtttHHHHHHHHHxxxxxxxxxx 
^ 60 

The pattern used to detect homeobox sequences is 24 residues long and spans
positions 34 to 57 of the homeobox domain. The consensus pattern is as follows:
[LIVMFYG]-[ASLVR]-x(2)-[LIVMSTACN]-x-[LIVM]-x(4)-[LIV]-[RKNQESTAIY]-
[LIVFSTNKH]-W-[FYVC]-x-[NDQTAH]-x(5)-[RKNAIMW].
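
The stated length of 24 residues, and the span from position 34 to position 57, can be checked directly from the pattern by summing the repeat counts of its elements. The following Python is a small sketch of that arithmetic; the function name is an assumption made for illustration.

```python
import re

# One pattern element: x, a single residue, [allowed] or {excluded} residues,
# optionally followed by a repeat count such as (2) or (3,5).
ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def pattern_length(pattern):
    """Return the (minimum, maximum) number of residues a pattern can span."""
    shortest = longest = 0
    for element in pattern.split("-"):
        match = ELEMENT.fullmatch(element.strip())
        if match is None:
            raise ValueError(f"unrecognized element: {element!r}")
        low = int(match.group(2) or 1)       # no count means exactly one residue
        high = int(match.group(3) or low)    # no upper bound means a fixed count
        shortest += low
        longest += high
    return shortest, longest


HOMEOBOX = ("[LIVMFYG]-[ASLVR]-x(2)-[LIVMSTACN]-x-[LIVM]-x(4)-[LIV]-[RKNQESTAIY]-"
            "[LIVFSTNKH]-W-[FYVC]-x-[NDQTAH]-x(5)-[RKNAIMW]")

shortest, longest = pattern_length(HOMEOBOX)
first_position = 34
print(shortest, longest, first_position + shortest - 1)  # 24 24 57
```

Because the homeobox pattern has no variable-length elements, the minimum and maximum are equal; for a pattern such as C-x(2,4)-C the two values differ.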

MAP kinase kinase (mkk). SEQ ID NOs:29, 31, 196, 3175, 3190 and
3281 represent novel members of the MAP kinase kinase family. MAP kinases
(MAPK) are involved in signal transduction, and are important in cell cycle and cell
growth controls. The MAP kinase kinases (MAPKK) are dual-specificity protein
kinases which phosphorylate and activate MAP kinases. MAPKK homologues have
been found in yeast, invertebrates, amphibians, and mammals. Moreover, the
MAPKK/MAPK phosphorylation switch constitutes a basic module activated in distinct
pathways in yeast and in vertebrates. MAPKKs are essential transducers through which
signals must pass before reaching the nucleus. For review, see, e.g., Biologique Biol
Cell (1993) 79:193-207; Nishida et al., Trends Biochem Sci (1993) 18:128-31;
Ruderman, Curr Opin Cell Biol (1993) 5:207-13; Dhanasekaran et al., Oncogene (1998)
17:1447-55; Kiefer et al., Biochem Soc Trans (1997) 25:491-8; and Hill, Cell Signal
(1996) 8:533-44.

Protein Kinase (protkinase). SEQ ID NOs:1157, 1478, 1496, 2286, 2969
and 3190 represent polynucleotides encoding protein kinases. Protein kinases catalyze
phosphorylation of proteins in a variety of pathways, and are implicated in cancer.
Eukaryotic protein kinases (Hanks S.K., et al., FASEB J. (1995) 9:576; Hunter T., Meth.
Enzymol. (1991) 200:3; Hanks S.K., et al., Meth. Enzymol. (1991) 200:38; Hanks S.K.,
Curr. Opin. Struct. Biol. (1991) 1:369; Hanks S.K. et al., Science (1988) 241:42) are
enzymes that belong to a very extensive family of proteins which share a conserved
catalytic core common to both serine/threonine and tyrosine protein kinases. There are
a number of conserved regions in the catalytic domain of protein kinases. The first
region, which is located in the N-terminal extremity of the catalytic domain, is a
glycine-rich stretch of residues in the vicinity of a lysine residue, which has been shown
to be involved in ATP binding. The second region, which is located in the central part
of the catalytic domain, contains a conserved aspartic acid residue which is important
for the catalytic activity of the enzyme (Knighton D.R. et al., Science (1991) 253:407).

The protein kinase profile includes two signature patterns for this second region: one
specific for serine/threonine kinases and the other for tyrosine kinases. A third profile
is based on the alignment in (Hanks S.K. et al., FASEB J. (1995) 9:576) and covers the
entire catalytic domain.

The consensus patterns are as follows: 1) [LIV]-G-{P}-G-{P}-
[FYWMGSTNH]-[SGA]-{PW}-[LIVCAT]-{PD}-x-[GSTACLIVMFY]-x(5,18)-
[LIVMFYWCSTAR]-[AIVP]-[LIVMFAGCKR]-K, where K binds ATP; 2)
[LIVMFYC]-x-[HY]-x-D-[LIVMFY]-K-x(2)-N-[LIVMFYCT](3), where D is an active
site residue; and 3) [LIVMFYC]-x-[HY]-x-D-[LIVMFY]-[RSTAC]-x(2)-N-
[LIVMFYC], where D is an active site residue.

If a protein analyzed includes two of the above protein kinase signatures,
the probability of it being a protein kinase is close to 100%.
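
The two-signature rule stated above can be sketched as a simple count of how many of the three consensus patterns occur in a sequence. In the Python below the regular expressions are hand translations of the three patterns. Labeling pattern 2 as the serine/threonine signature and pattern 3 as the tyrosine signature is an assumption based on the lysine in pattern 2, and the test sequence is a hypothetical fragment invented for illustration.

```python
import re

# Hand-translated regular expressions for the three consensus patterns.
# The Ser/Thr and Tyr labels are assumptions, not stated in the text.
SIGNATURES = {
    "ATP-binding region": re.compile(
        r"[LIV]G[^P]G[^P][FYWMGSTNH][SGA][^PW][LIVCAT][^PD].[GSTACLIVMFY]"
        r".{5,18}[LIVMFYWCSTAR][AIVP][LIVMFAGCKR]K"),
    "Ser/Thr kinase active site": re.compile(
        r"[LIVMFYC].[HY].D[LIVMFY]K.{2}N[LIVMFYCT]{3}"),
    "Tyr kinase active site": re.compile(
        r"[LIVMFYC].[HY].D[LIVMFY][RSTAC].{2}N[LIVMFYC]"),
}


def kinase_signatures(sequence):
    """Return the names of the kinase signatures present in the sequence."""
    return [name for name, regex in SIGNATURES.items() if regex.search(sequence)]


def looks_like_kinase(sequence):
    """Apply the two-signature rule: at least two signatures must be present."""
    return len(kinase_signatures(sequence)) >= 2


# Hypothetical sequence containing an ATP-binding stretch and an active site.
candidate = "MGN" + "LGTGSFGRVMLVKHKETGNHYAMK" + "AAAA" + "IVHRDLKPENILL" + "DE"
print(kinase_signatures(candidate))
print(looks_like_kinase(candidate))
```

The rule is a screening heuristic; a sequence carrying only one signature is left unclassified by this sketch, not ruled out.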

Ras family proteins (ras). SEQ ID NOs:1688 and 3258 represent
polynucleotides encoding novel members of the ras family of small GTP/GDP-binding
proteins (Valencia et al., 1991, Biochemistry 30:4637-4648). Ras family members
generally require a specific guanine nucleotide exchange factor (GEF) and a specific
GTPase activating protein (GAP) as stimulators of overall GTPase activity. Among
ras-related proteins, the highest degree of sequence conservation is found in four
regions that are directly involved in guanine nucleotide binding. The first two
constitute most of the phosphate and Mg2+ binding site (PM site) and are located in the
first half of the G-domain. The other two regions are involved in guanosine binding and
are located in the C-terminal half of the molecule. Motifs and conserved structural
features of the ras-related proteins are described in Valencia et al., 1991, Biochemistry
30:4637-4648. A major consensus pattern of ras proteins is: D-T-A-G-Q-E-K-[LF]-G-
G-L-R-[DE]-G-Y-Y.

Thioredoxin family active site (Thioredox). SEQ ID NO:1677 represents
a polynucleotide encoding a protein having a thioredoxin family active site.
Thioredoxins (Holmgren A., Annu. Rev. Biochem. (1985) 54:237; Gleason F.K. et al.,
FEMS Microbiol. Rev. (1988) 54:271; Holmgren, A. J. Biol. Chem. (1989) 264:13963;
Eklund H. et al., Proteins (1991) 11:13) are small proteins of approximately one
hundred amino acid residues which participate in various redox reactions via the
reversible oxidation of an active center disulfide bond. They exist in either a reduced
form or an oxidized form where the two cysteine residues are linked in an
intramolecular disulfide bond. Thioredoxin is present in prokaryotes and eukaryotes
and the sequence around the redox-active disulfide bond is well conserved. All PDIs
contain two or three (ERp72) copies of the thioredoxin domain. The consensus pattern
is: [LIVMF]-[LIVMSTA]-x-[LIVMFYC]-[FYWSTHE]-x(2)-[FYWGTN]-C-
[GATPLVE]-[PHYWSTA]-C-x(6)-[LIVMFYWT] (where the two C's form the redox-
active bond).
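
Because the pattern fixes the location of the two redox-active cysteines, a match can also be used to report where those residues fall in a sequence. The Python below is a minimal sketch of that idea using capture groups around the two C positions; the function name and the 20-residue test fragment are hypothetical and invented for illustration.

```python
import re

# Hand translation of the thioredoxin pattern; the two parenthesized C's are
# the cysteines described as forming the redox-active bond.
THIOREDOXIN = re.compile(
    r"[LIVMF][LIVMSTA].[LIVMFYC][FYWSTHE].{2}[FYWGTN]"
    r"(C)[GATPLVE][PHYWSTA](C).{6}[LIVMFYWT]")


def redox_cysteines(sequence):
    """Return the 1-based positions of the two active-site cysteines, or None."""
    match = THIOREDOXIN.search(sequence)
    if match is None:
        return None
    return match.start(1) + 1, match.start(2) + 1


# Hypothetical fragment, invented for illustration only.
fragment = "ILVDFWAEWCGPCKMIAPIL"
print(redox_cysteines(fragment))  # (10, 13)
```

Positions are reported in 1-based numbering to match the residue numbering conventions used for the domain descriptions in this section.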

Trypsin (trypsin). SEQ ID NO:1410 corresponds to a novel serine
protease of the trypsin family. The catalytic activity of the serine proteases from the
trypsin family is provided by a charge relay system involving an aspartic acid residue
hydrogen-bonded to a histidine, which itself is hydrogen-bonded to a serine. The
sequences in the vicinity of the active site serine and histidine residues are well
conserved in this family of proteases (Brenner S., Nature (1988) 334:528). The
consensus patterns for this trypsin protein family are: 1) [LIVM]-[ST]-A-[STAG]-H-C,
where H is the active site residue; and 2) [DNSTAGC]-[GSTAPIMVQH]-x(2)-G-[DE]-
S-G-[GS]-[SAPHV]-[LIVMFYWH]-[LIVMFYSTANQH], where S is the active site
residue. All sequences known to belong to this family are detected by the above
consensus sequences, except for 18 different proteases which have lost the first
conserved glycine. If a protein includes both the serine and the histidine active site
signatures, the probability of it being a trypsin family serine protease is 100%.

WD Domain, G-Beta Repeats (WD domain). SEQ ID NOs:1336, 1380,
1711, 1762, 1909, 2218, 3047, 3108 and 3292 represent novel members of the WD
domain/G-beta repeat family. Beta-transducin (G-beta) is one of the three subunits
(alpha, beta, and gamma) of the guanine nucleotide-binding proteins (G proteins) which
act as intermediaries in the transduction of signals generated by transmembrane
receptors (Gilman, Annu. Rev. Biochem. (1987) 56:615). The alpha subunit binds to
and hydrolyzes GTP; the functions of the beta and gamma subunits are less clear but
they seem to be required for the replacement of GDP by GTP as well as for membrane
anchoring and receptor recognition. In higher eukaryotes, G-beta exists as a small
multigene family of highly conserved proteins of about 340 amino acid residues.
Structurally, G-beta consists of eight tandem repeats of about 40 residues, each
containing a central Trp-Asp motif (this type of repeat is sometimes called a WD-40
repeat). The consensus pattern for the WD domain/G-Beta repeat family is:
[LIVMSTAC]-[LIVMFYWSTAGC]-[LIMSTAG]-[LIVMSTAGC]-x(2)-[DN]-x(2)-
[LIVMWSTAC]-x-[LIVMFSTAG]-W-[DEN]-[LIVMFSTAGCN].

wnt Family of Developmental Signaling Proteins (Wnt dev sign). SEQ
ID NO:1538 corresponds to a novel member of the wnt family of developmental
signaling proteins. Wnt-1 (previously known as int-1), the seminal member of this
family (Nusse R., Trends Genet. (1988) 4:291), is thought to play a role in intercellular
communication and seems to be a signaling molecule important in the development of
the central nervous system (CNS). All wnt family proteins share the following features
characteristic of secretory proteins: a signal peptide, several potential N-glycosylation
sites and 22 conserved cysteines that are probably involved in disulfide bonds. The
Wnt proteins seem to adhere to the plasma membrane of the secreting cells and are
therefore likely to signal over only a few cell diameters. The consensus pattern, which is
based upon a highly conserved region including three cysteines, is as follows: C-K-C-
H-G-[LIVMT]-S-G-x-C.

Protein Tyrosine Phosphatase (Y phosphatase). SEQ ID NO:1417
represents a polynucleotide encoding a protein tyrosine phosphatase. Tyrosine specific
protein phosphatases (EC 3.1.3.48) (PTPase) (Fischer et al., Science (1991) 253:401;
Charbonneau et al., Annu. Rev. Cell Biol. (1992) 8:463; Trowbridge, J. Biol. Chem.
(1991) 266:23517; Tonks et al., Trends Biochem. Sci. (1989) 14:497; and Hunter, Cell
(1989) 58:1013) catalyze the removal of a phosphate group attached to a tyrosine
residue. These enzymes are very important in the control of cell growth, proliferation,
differentiation and transformation. Multiple forms of PTPase have been characterized
and can be classified into two categories: soluble PTPases and transmembrane receptor
proteins that contain PTPase domain(s). Structurally, all known receptor PTPases are
made up of a variable length extracellular domain, followed by a transmembrane region
and a C-terminal catalytic cytoplasmic domain. PTPase domains consist of about 300
amino acids. There are two conserved cysteines; the second has been shown to be absolutely
required for activity. Furthermore, a number of conserved residues in its immediate
vicinity have also been shown to be important. The consensus pattern for PTPases is:
[LIVMF]-H-C-x(2)-G-x(3)-[STC]-[STAGP]-x-[LIVMFY]; C is the active site residue.

Zinc Finger, C2H2 Type (Zincfing C2H2). SEQ ID NOs:308, 807,
1324, 1503, 1527, 3081, 3193 and 3306 correspond to polynucleotides encoding novel
members of the C2H2 type zinc finger protein family. Zinc finger domains (Klug
et al., Trends Biochem. Sci. (1987) 12:464; Evans et al., Cell (1988) 52:1; Payre et al.,
FEBS Lett. (1988) 234:245; Miller et al., EMBO J. (1985) 4:1609; and Berg, Proc. Natl.
Acad. Sci. USA (1988) 85:99) are nucleic acid-binding protein structures. In addition to
the conserved zinc ligand residues, it has been shown that a number of other positions
are also important for the structural integrity of the C2H2 zinc fingers (Rosenfeld et al.,
J. Biomol. Struct. Dyn. (1993) 11:557). The best conserved position is found four
residues after the second cysteine; it is generally an aromatic or aliphatic residue. The
consensus pattern for C2H2 zinc fingers is: C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-
x(3,5)-H. The two C's and two H's are zinc ligands.
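
Zinc finger proteins often carry several fingers in tandem, so a scan for this pattern typically reports every non-overlapping occurrence instead of a single yes/no answer. The Python below sketches such a scan; the variable-length spacers x(2,4) and x(3,5) become bounded repeats in the regular expression. The function name and the two-finger test sequence are hypothetical and invented for illustration.

```python
import re

# Hand translation of C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H.
C2H2 = re.compile(r"C.{2,4}C.{3}[LIVMFYWC].{8}H.{3,5}H")


def find_zinc_fingers(sequence):
    """Return (1-based start, matched text) for each non-overlapping match."""
    return [(m.start() + 1, m.group()) for m in C2H2.finditer(sequence)]


# Hypothetical sequence: two finger-like stretches joined by a short linker.
sequence = "YKCPECGKSFSQKSNLQKHQRTH" + "TGEKP" + "YKCEECGKAFTRSSHLTKHKRIH"

for start, text in find_zinc_fingers(sequence):
    print(start, text)
```

Each reported start is the position of the first zinc-ligand cysteine, so the conserved hydrophobic position described above lies four residues after the second cysteine of each match.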

Src homology 2. SEQ ID NOs:186, 2591, 3307 and 3339 represent
polynucleotides encoding novel members of the family of Src homology 2 (SH2)
proteins. The Src homology 2 (SH2) domain is a protein domain of about 100 amino
acid residues first identified as a conserved sequence region between the oncoproteins
Src and Fps (Sadowski I. et al., Mol. Cell. Biol. 6:4396-4408 (1986)). Similar sequences
are found in many other intracellular signal-transducing proteins (Russel R.B. et al.,
FEBS Lett. 304:15-20 (1992)). SH2 domains function as regulatory modules of
intracellular signaling cascades by interacting with high affinity with phosphotyrosine-
containing target peptides in a sequence-specific and phosphorylation-dependent
manner (Marengere L.E.M., Pawson T., J. Cell Sci. Suppl. 18:97-104 (1994); Pawson
T., Schlessinger J., Curr. Biol. 3:434-442 (1993); Mayer B.J., Baltimore D., Trends
Cell Biol. 3:8-13 (1993); Pawson T., Nature 373:573-580 (1995)).

The SH2 domain has a conserved 3D structure consisting of two alpha
helices and six to seven beta-strands. The core of the domain is formed by a continuous
beta-meander composed of two connected beta-sheets (Kuriyan J., Cowburn D., Curr.
Opin. Struct. Biol. 3:828-837 (1993)). The profile to detect SH2 domains is based on a
structural alignment consisting of 8 gap-free blocks and 7 linker regions totaling 92
match positions.

Src homology 3. SEQ ID NOs:234, 1832 and 1835 represent
polynucleotides encoding novel members of the family of Src homology 3 (SH3)
proteins. The Src homology 3 (SH3) domain is a small protein domain of about 60
amino acid residues first identified as a conserved sequence in the non-catalytic part of
several cytoplasmic protein tyrosine kinases (e.g., Src, Abl, Lck) (Mayer B.J. et al.,
Nature 332:272-275 (1988)). Since then, it has been found in a great variety of other
intracellular or membrane-associated proteins (Musacchio A. et al., FEBS Lett. 307:55-
61 (1992); Pawson T., Schlessinger J., Curr. Biol. 3:434-442 (1993); Mayer B.J.,
Baltimore D., Trends Cell Biol. 3:8-13 (1993); Pawson T., Nature 373:573-580 (1995)).

The SH3 domain has a characteristic fold which consists of five or six
beta strands arranged as two tightly packed anti-parallel beta sheets. The linker regions
may contain short helices (Kuriyan J., Cowburn D., Curr. Opin. Struct. Biol. 3:828-837
(1993)).

The function of the SH3 domain may be to mediate assembly of specific
protein complexes via binding to proline-rich peptides (Morton C.J., Campbell I.D.,
Curr. Biol. 4:615-617 (1994)).

In general SH3 domains are found as single copies in a given protein, but
there are a significant number of proteins with two SH3 domains and a few with 3 or 4
copies.

Fibronectin type III. SEQ ID NOs:746 and 1192 represent
polynucleotides encoding novel members of the family of fibronectin type III proteins.
A number of receptors for lymphokines, hematopoietic growth factors and growth
hormone-related molecules have been found to share a common binding domain
(Bazan J.F., Biochem. Biophys. Res. Commun. 164:788-795 (1989); Bazan J.F., Proc.
Natl. Acad. Sci. U.S.A. 87:6934-6938 (1990); Cosman D. et al., Trends Biochem. Sci.
15:265-270 (1990); d'Andrea A.D., Fasman G.D., Lodish H.F., Cell 58:1023-1024
(1989); d'Andrea A.D., Fasman G.D., Lodish H.F., Curr. Opin. Cell Biol. 2:648-651
(1990)).

The conserved region constitutes all or part of the extracellular ligand-
binding region and is about 200 amino acid residues long. In the N-terminal of this
domain there are two pairs of cysteines known, in the growth hormone receptor, to be
involved in disulfide bonds.

+----------------------------------xxxxxxx---------------+
| C C   C  C       Extracellular   XXXXXXX  Cytoplasmic  |
+-|-|---|--|-----------------------xxxxxxx---------------+
  | |   |  |                    Transmembrane
  +-+   +--+

Two patterns detect this family of receptors. The first one is derived
from the first N-terminal disulfide loop; the second is a tryptophan-rich pattern located
at the C-terminal extremity of the extracellular region.

A consensus for this protein family is: C-[LVFYR]-x(7,8)-[STIVDN]-C-
x-W (the two C's are linked by a disulfide bond). A second consensus for this protein
family is: [STGL]-x-W-[SG]-x-W-S.

LIM domain containing proteins. SEQ ID NOs:1269, 1309, 1360, and
1386 represent polynucleotides encoding novel members of the family of LIM domain
containing proteins. A number of proteins contain a conserved cysteine-rich domain of
about 60 amino acid residues (Freyd G. et al., Nature 344:876-879 (1990); Baltz R. et
al., Plant Cell 4:1465-1466 (1992); Sanchez-Garcia I., Rabbitts T.H., Trends Genet.
10:315-320 (1994)).

In the LIM domain, there are seven conserved cysteine residues and a
histidine. The arrangement followed by these conserved residues is C-x(2)-C-x(16,23)-
H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD]. The LIM domain binds two zinc
ions (Michelsen J.W. et al., Proc. Natl. Acad. Sci. U.S.A. 90:4404-4408 (1993)). LIM
does not bind DNA; rather, it seems to act as an interface for protein-protein interaction.
The consensus for this protein family is: C-x(2)-C-x(15,21)-[FYWH]-H-x(2)-[CH]-
x(2)-C-x(2)-C-x(3)-[LIVMF]. The 5 C's and the H bind zinc.

C2 domain (protein kinase C like). SEQ ID NOs:1325 and 2282
represent polynucleotides encoding novel members of the family of C2 domain
containing proteins. Some isozymes of protein kinase C (PKC) contain a domain,
known as C2, of about 116 amino acid residues, which is located between the two
copies of the C1 domain (that bind phorbol esters and diacylglycerol) and the protein
kinase catalytic domain (Azzi A. et al., Eur. J. Biochem. 205:547-557 (1992); Stabel S.,
Semin. Cancer Biol. 5:277-284 (1994)).

The C2 domain is involved in calcium-dependent phospholipid binding
(Davletov B.A., Suedhof T.C., J. Biol. Chem. 268:26386-26390 (1993)). Since
domains related to the C2 domain are also found in proteins that do not bind calcium,
other putative functions for the C2 domain include binding to inositol-1,3,4,5-
tetraphosphate (Fukuda M., et al., J. Biol. Chem. 269:29206-29211 (1994)).

The consensus pattern for the C2 domain is located in a conserved part
of that domain, the connecting loop between beta strands 2 and 3. The profile for the C2
domain covers the total domain. The consensus for this protein family is: [ACG]-x(2)-
L-x(2,3)-D-x(1,2)-[NGSTLIF]-[GTMR]-x-[STAP]-D-[PA]-[FY].

Serine proteases, trypsin family, active sites. SEQ ID NO:1410
represents a polynucleotide encoding a novel member of the family of serine protease,
trypsin proteins. The catalytic activity of the serine proteases from the trypsin family is
provided by a charge relay system involving an aspartic acid residue hydrogen-bonded
to a histidine, which itself is hydrogen-bonded to a serine. The sequences in the vicinity
of the active site serine and histidine residues are well conserved in this family of
proteases (Brenner S., Nature 334:528-530 (1988)).

A consensus for this protein family is: [LIVM]-[ST]-A-[STAG]-H-C [H
is the active site residue]. A second consensus for this protein family is: [DNSTAGC]-
[GSTAPIMVQH]-x(2)-G-[DE]-S-G-[GS]-[SAPHV]-[LIVMFYWH]-
[LIVMFYSTANQH] [S is the active site residue].

RNA Recognition Motif Domain (RRM, RBD, or RNP). SEQ ID NOs:
1464 and 1514 represent polynucleotides encoding novel members of the family of
RNA recognition motif domain proteins (Bandziulis R.J. et al., Genes Dev. 3:431-437
(1989); Dreyfuss G. et al., Trends Biochem. Sci. 13:86-91 (1988)).

Inside the putative RNA-binding domain there are two regions which are
highly conserved. The first one is a hydrophobic segment of six residues (which is
called the RNP-2 motif); the second one is an octapeptide motif (which is called RNP-1
or RNP-CS). The position of both motifs in the domain is shown in the following
schematic representation:

xxxxxxx###xxxxxxxxxxxxxxxxxxxxxxxxxxxxx#######xxxxxxxxxxxxxxxxxxxxxxxxx 
       RNP-2                           RNP-1

As a consensus pattern for this type of domain the RNP-1 motif was
used. The consensus for this protein family is: [RK]-G-{EDRKHPCG}-[AGSCI]-
[FY]-[LIVA]-x-[FYLM].

Phosphatidylinositol-specific phospholipase C, Y Domain. SEQ ID NO:
1707 represents a polynucleotide encoding a novel member of the phosphatidylinositol-
specific phospholipase C, Y domain family of proteins. Phosphatidylinositol-specific
phospholipase C (EC 3.1.4.11), a eukaryotic intracellular enzyme, plays an important
role in signal transduction processes (Meldrum E. et al., Biochim. Biophys. Acta
1092:49-71 (1991)). It catalyzes the hydrolysis of 1-phosphatidyl-D-myo-inositol-
3,4,5-triphosphate into the second messenger molecules diacylglycerol and inositol-
1,4,5-triphosphate. This catalytic process is tightly regulated by reversible
phosphorylation and binding of regulatory proteins (Rhee S.G., Choi K.D., Adv. Second
Messenger Phosphoprotein Res. 26:35-61 (1992); Rhee S.G., Choi K.D., J. Biol. Chem.
267:12393-12396 (1992); Sternweis P.C., Smrcka A.V., Trends Biochem. Sci. 17:502-
506 (1992)).

All eukaryotic PI-PLCs contain two regions of homology, referred to as
"X-box" and "Y-box". The order of these two regions is the same (NH2-X-Y-COOH),
but the spacing is variable. In most isoforms, the distance between these two regions is
only 50-100 residues, but in the gamma isoforms one PH domain, two SH2 domains,
and one SH3 domain are inserted between the two PLC-specific domains. The two
conserved regions have been shown to be important for the catalytic activity. At the C-
terminal of the Y-box, there is a C2 domain possibly involved in Ca-dependent
membrane attachment.

Serine Carboxypeptidases. SEQ ID NO:1744 represents a
polynucleotide encoding a novel member of the serine carboxypeptidases family of
proteins. Carboxypeptidases may be either metallo carboxypeptidases or serine
carboxypeptidases (EC 3.4.16.5 and EC 3.4.16.6). The catalytic activity of the serine
carboxypeptidases, like that of the trypsin family serine proteases, is provided by a
charge relay system involving an aspartic acid residue hydrogen-bonded to a histidine,
which is itself hydrogen-bonded to a serine (Liao D.I., Remington S.J., J. Biol. Chem.
265:6528-6531 (1990)).

The sequences surrounding the active site serine and histidine residues
are highly conserved in all these serine carboxypeptidases. A consensus for this protein
family is: [LIVM]-x-[GTA]-E-S-Y-[AG]-[GS] [S is the active site residue]. A second
consensus for this protein family is: [LIVF]-x(2)-[LIVSTA]-x-[IVPST]-x-[GSDNQL]-
[SAGV]-[SG]-H-x-[IVAQ]-P-x(3)-[PSA] [H is the active site residue].

dsrm Double-Stranded RNA Binding Motif. SEQ ID NO:1818
represents a polynucleotide encoding a novel member of the dsrm double-stranded
RNA binding motif proteins. In eukaryotic cells, a multitude of RNA-binding proteins
play key roles in the posttranscriptional regulation of gene expression. Characterization
of these proteins has led to the identification of several RNA-binding motifs. Several
human and other vertebrate genetic disorders are caused by aberrant expression of
RNA-binding proteins (C. G. Burd & G. Dreyfuss, Science 265:615-621 (1994)).

Proteins containing double stranded RNA binding motifs bind to specific
RNA targets. Double stranded RNA binding motifs are exemplified by interferon-
induced protein kinase in humans, which is part of the cellular response to dsRNA.

SEQ ID NOs:2577, 3183 and 3195 encode members of the 4 transmembrane
integral membrane protein family. This family consists of type III proteins,
which are integral membrane proteins that contain an N-terminal membrane-anchoring
domain that is not cleaved during biosynthesis, and which functions as a translocation
signal and a membrane anchor. The proteins also have three additional transmembrane
regions. The consensus pattern is: G-x(3)-[LIVMF]-x(2)-[GSA]-[LIVMF](2)-G-C-x-
[GA]-[STA]-x(2)-[EG]-x(2)-[CWN]-[LIVM](2).

SEQ ID NO:2944 encodes a polypeptide having a calpain large subunit,
domain III. Calpains are a family of intracellular proteases that play a variety of
biological roles. Calpain 3, also known as p94, is predominantly expressed in skeletal
muscle and plays a role in limb-girdle muscular dystrophy type 2A (Sorimachi, H. et
al., Biochem. J. 328:721-732, 1997).

SEQ ID NOs:1911 and 1980 encode polypeptides having a C3HC4 type
zinc finger domain (RING finger), which is a cysteine-rich domain of 40 to 60 residues
that binds two atoms of zinc, and is believed to be involved in mediating protein-protein
interactions. Mammalian proteins of this family include V(D)J recombination
activating protein, which activates the rearrangement of immunoglobulin and T-cell
receptor genes; breast cancer type 1 susceptibility protein (BRCA1); bmi-1 proto-
oncogene; cbl proto-oncogene; and mel-18 protein, which is expressed in a variety of
tumor cells and is a transcriptional repressor that recognizes and binds a specific DNA
sequence. The consensus pattern is: C-x-H-x-[LIVMFY]-C-x(2)-C-[LIVMYA].

SEQ ID NO:3274 encodes a eukaryotic transcription factor with a fork
head domain of about 100 amino acid residues. Proteins of this group are transcription
factors, including mammalian transcription factors HNF-3-alpha, -beta, and -gamma;
interleukin-enhancer binding factor; and HTLF, which binds to a region of human T-
cell leukemia virus long terminal repeat. The consensus pattern is: [KR]-P-[PTQ]-
[FYLVQH]-S-[FY]-x(2)-[LIVM]-x(3,4)-[AC]-[LIM].

SEQ ID NO:3345 encodes a polypeptide having a PDZ domain. Several
dozen signaling proteins belong to this group of proteins that have 80-100 residue
repeats known as PDZ domains. Several of the proteins interact with the C-terminal
tetrapeptide motifs X-Ser/Thr-X-Val-COO- of ion channels and/or receptors (Ponting,
C. P., Protein Sci. 6:464-468, 1997).

SEQ ID NO:3351 encodes a polypeptide in the family of phorbol
esters/diacylglycerol binding proteins. Phorbol esters (PE) are analogues of diacylglycerol
(DAG) and potent tumor promoters. DAG activates a family of serine-threonine protein
kinases, known as protein kinase C. The N-terminal region of protein kinase C binds
PE and DAG, and contains one or two copies of a cysteine-rich domain of about 50
amino acid residues. Other proteins having this domain include diacylglycerol kinase;
the vav oncogene; and N-chimaerin, a brain-specific protein. The DAG/PE binding
domain binds two zinc ions through the six cysteines and two histidines that are
conserved in the domain. The consensus pattern is: H-x-[LIVMFYW]-x(8,11)-C-x(2)-
C-x(3)-[LIVMFC]-x(5,10)-C-x(2)-C-x(4)-[HD]-x(2)-C-x(5,9)-C.

SEQ ID NO:2216 encodes a polypeptide having a WW/rsp5/WWP
domain. The domain is named for the presence of conserved aromatic positions,
generally tryptophan, as well as a conserved proline. Proteins having the domain
include dystrophin, vertebrate YAP protein, and IQGAP, a human GTPase activating
protein which acts on ras. The consensus pattern is: W-x(9,11)-[VFY]-[FYW]-x(6,7)-
[GSTNE]-[GSTQCR]-[FYW]-x(2)-P.

SEQ ID NO:2428 encodes a member of the dual specificity phosphatase
family, having a catalytic domain, and SEQ ID NOs:2281 and 2310 encode members
of the protein tyrosine phosphatase family. These families are related and classified as
tyrosine specific protein phosphatases. The enzymes catalyze the removal of a
phosphate group from a tyrosine residue, and are important in the control of cell growth,
proliferation, differentiation, and transformation. The consensus pattern is: [LIVMF]-H-
C-x(2)-G-x(3)-[STC]-[STAGP]-x-[LIVMFY].

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Table 1 



SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATE 


4 CLONE ID 


LIBRARY. 


1 


377044 


RTA00002676F.p.l 1 .2. P.Seq 


F 


M00039329A.C0I 


CH09LNL 


2 


377708 


RTA000O2633F.m.0l.2.P.S« 


I F 


M00040039A:G08 


CH09LNL 


3 


427782 


RTA00002666F.I.06.1. P.Seq 




M00032638D:A06 


CH08LNH 


4 


29372 


RTAQOO027l2F.x06.LP.Seq 


F 


M00023282A:C02 


CH04MAL 


5 


455003 


RTA00002694F.b.02. 1 .P.Seq 


F 


M000434I9D:AIO 


CH20COHLV 


6 


380625 


RTA00002684F.d.03.2.P.Seq 


F 


M00040I18D:G10 


CH09LNL 


7 


450959 


RTA0OO0269!F.b.05.3. P.Seq 


F 


lM00043306D:B07 


CH17COHLV 


3 


397351 


RTA00002680F.b.04. 1 .P.Seq 


F 


M00039775A.-A09 


CH09LNL 


9 


20652 


RTA000027 1 OF.k.O U .P.Seq 


F 


M00022440B.E01 


CH03MAH 


10 


97830 


RTA00002663F.k. 1 S. ! .P.Seq 


F 


M00022767B:G1 1 


CH03MAH 


i: 


373071 


RTA00002670FJ.23. 1 .P.Seq 


F 


M00033442A:D06 


CH09LNL 


12 


162369- 


RTA00002713F.e.0 1.1. P.Seq 


F 


tV100027292D:F10 


CH04MAL 


13 


401247 


RTA 00002685 F . f. 1 5 .2 . P. Seq 


F 


M00039503A:CI2 


CH12EDT 


14 


430738 


RTA00002669F.L 1 5.3. P.Seq 


F 


M0003323ID:B09 


CH08LNH 


15 


46779 


RTA0OO0271 1 Fx. 14. 1 .P.Seq 


F 


M00022860C:G04 


CH03MAH 


16 


375772 


RTA0000268IF.p.OL2. P.Seq 


F 


M000399(NC:GO5 


CH09LNL 


17 


4306S9 


RTA00002669FJ.O 1 .3. P.Seq 


F 


M00033243B:A05 


CHOSLNH 


IS 


376546 


RTA0O002677F.d.07.2.P.Seq 


F 


M00039545C:C12 


CH09LNL 


19 


430041 


RTA00002667F.f. 1 7. 1 .P.Sea 


F 


M00032790B:A07 


CHOSLNH 


20 


431643 


RTA00002669F.U6.I.P.Seq 


F 


M000332*6D:H09 


CHOSLNH 


21 


19422 


RTA0OO02709F.C.02. 1 .P.Seq 


F 


M00005449B:B10 


CH02COH 


22 


376302 


RTA00002677F.c.!8.2.PSeq 


F 


M00039344B:G07 


CH09LNL 


23 


376314 


RTA00002674F.h.02.l. P.Seq 


F 


M00039(39C:GI2 


CH0°LNL 


24 


375492 


RTA000026~7F.m.I9.2.P.Seq 


F 


M000394I83:D08 


CH09LNL 


25 


379! 14 


RTA0000268 I F.n.24.2. P.Seq 


F 


M00039903C:F03 


CH0°LNL 


26 


380668 


RTA00002670F.p.l P.Seq 


F 


M0003353!C:H10 


CHO'JLNL 


27 


213817 


RTA00002664F.i.l9.2.P.Seq 


F 


M0002763-iA:OI ! 


c:-:o4\ial 


23 


375740 


RTA0O0O2680F.f.23.I.P.Seq 


F 


M000;^-95D:G06 


CHO'LNL 


29 


430396 


RTA00002669F.b.20.4. P.Seq 


F 


M00O33iS5C:DOI 


CHOSLNH 


30 


380462 


RTA00002670F.O.0 1.1. P.Seq 


F 


M00O335"OB:EO6 


CHO^LNL 


31 


430396 


RTA00002669F.b.20.3.P.Seq 


F 


M00O3313:C:D0! 


CHOSLNH 


32 


376996 . 


RTA00002676F.p. 13.2. P.Seq 


F 


M00039329C:BIO 


CH0°LNL 


33 


374846 


R7A00002077F.k.!9.2.P.Seq 


F 


M000394I2D:G06 


CH0OLNL 


34 


379075 


RTA00002672F.n. 13.2. P.Seq 


F 


M0003903^B:E03 


CH0°LNL 


35 


374172 


RTAC0002673F.k. 16.2. P.Seq 


F 


M00039097D:D06 


CH09LNL 


36 


373104 


RTA00002633F.0. 1 5.2. P.Seq 




M0004O0^SD:G!2 


CHO°LNL 


37 


136302 


RTA000027|3F.m.2 1.1. P.Seq 




M000275O|B:C04 


CHO^MAL 


38 


427947 


RTA00002665F.O.0 1.1. P.Seq 




M000324O:B:D02 


CHOSLNH 


39 


375180 


RTA00002673F.d.! 7.i. P.Seq 




M0003 4 >Ot^D:H09 


CH0°LNL 


40 


377584 


RTA00002633F.I.2*V.P.Seq 




M000400SSC.E10 


CH0°LNL 


41 


377364 


RTA0000267SF.a.l5.2.P.Seq 




moco;^4;:c-aO! 


CH0°LNL 


42 


376347 


RTA00002675F. 1.08. i. P.Sea 




M000}o:ioc;GI 1 


CH0°LNL 


43 


446747 


RTAOOO!326S°F.d.l6.2.P.Seq 




M00042740A:E09 


CHifCON 


44 


28092 


RTA00002"! ! F.'j. 12. 1 .P.Seq J F 


M0OO23032A:BO5 


CH03MAH 


45 


378206 


RTA000O26:iF.j.20.J. P.Seq 




M000335SSC:G04 


CH0°LNL 


46 


373206 


RTA00002o"IF.a.20.2. P.Seq 




M000335SSC:G04 1 


CHO°LNL 


47 


14940 


RTA0C0i:2"0^F.J.I !.! P.Se'j 


F | mcooo56:;a:Go: | cho:cok 
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SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY 


43 


37341 1 


RTA00002672F.g.l3.2.P.Seq 


F 


M00039004B:A06 


CH09LNL 


49 


38120 


RTA000027!2F.i.l4.l.P.Seq 


F 


M00026927D:F02 


CH04MAL 


50 


375730 


RTA00002678F.U3.2.P.Seq 


F 


M00039612B.G05 


CH09LNL 


51 


428959 


RTA00002667F.h.I5.I.P.Scq 


F 


M0003281 1B.D02 


CH08LNH 


52 


376851 


RTA00002677F.c.03.2.P.Seq 


F 


M00039341CHII 


CH09LNL 


53 


373808 


RTA0000267lF.d.l4.2.P.Seq 


F 


M00038272A:GOl 


CH09LNL 


54 


376168 


RTA00002675F.n. 17. LP.Seq 


F 


M00039253B:E06 


CH09LNL 


55 


13653 


RTA00002712F.O.08. LP.Seq 


F 


M00027135A:B1 1 


CH04MAL 


56 


187632 


RTA00002664F.U5.1.P.Seq 


F 


M00027617B:C12


CH04MAL 


57 


374122 


RTA00002673F.l.22.1.P.Seq


F 


M00039104D:C09 


CH09LNL 


58 


374946 


RTA00002673F.j.24.1.P.Seq


F 


M00039096A:E07 


CH09LNL 


59 


375666


RTA00002677F.n.16.2.P.Seq


F 


M00039422D:F04 


CH09LNL 


60 


162369 


RTA00002713F.d.24.1.P.Seq


F 


M00027292D:F10 


CH04MAL 


61 


21480 


RTA00002709F.c.18.2.P.Seq


F 


M00005531D:F06


CH02COH 


62 


18560 


RTA00002711F.e.20.1.P.Seq


F 


M00022933B:F07 


CH03MAH 


63 


96575 


RTA00002663F.J.0S. LP.Seq 


F 


M00022641C:H05


CH03MAH 


64 


377576 


RTA00002682F.f.13.1.P.Seq


F 


M00039975C:C11 


CH09LNL 


65 


446747 


RTA00002689F.d.16.3.P.Seq


F 


M00042740A:E09 


CH15CON


66 


379311 


RTA00002682F.g.01.1.P.Seq


F 


M00039976D:A12 


CH09LNL 


67 


379311


RTA00002682F.f.24.1.P.Seq


F 


M00039976D:A12


CH09LNL 


68 


124549 


RTA00002713F.c.07.1.P.Seq


F 


M00027237C:B08


CH04MAL 


69 


449785 


RTA00002691F.c.07.3.P.Seq 


F 


M00043345C:A06 


CH17COHLV


70 


375134 


RTA00002673F.k.22.2.P.Seq | F


M00039099A:H08 


CH09LNL 


71 


186593 


RTA00002713F.n.15.1.P.Seq


F 


M00027620D:F11


CH04MAL 


72 


449831


RTA00002691F.a.17.3.P.Seq


F 


M00042518D:A06 


CH17COHLV


73 


379678 


RTA00002676F.b.06.1.P.Seq


F 


M00039274B:G07


CH09LNL 


74 


20599 


RTA00002708F.h.06.1.P.Seq


F 


M00004264B:A05 


CH01COH


75 


41115


RTA00002713F.o.11.1.P.Seq


F 


M00027652B:F11


CH04MAL 


76 


21109 


RTA00002708F.h.12.1.P.Seq


F 


M00004278A:F09 


CH01COH


77 


455702 


RTA00002694F.b.11.1.P.Seq


F 


M00043433C:G07 


CH20COHLV 


78 


380643 


RTA00002683F.p.09.2.P.Seq


F 


M00040103B:H10 


CH09LNL 


79 


374413 


RTA00002672F.Lt5.2.P.Seq 


F 


M00039015B:G10


CH09LNL 


80 


378891 


RTA00002672F.L18.2.P.Seq 


F 


M00039016A:A02 


CH09LNL 


81 


379374 


RTA00002672F.k.11.2.P.Seq


F 


M0003902SC:B1 1 


CH09LNL


82 


17253 


RTA00002709F.h.23.1.P.Seq


F 


M00006866A:D07 


CH02COH 


83 


21565 


RTA00002709F.e.11.1.P.Seq


F 


M0000577SB:F09 


CH02COH 


84 


373996 


RTA00002673F.n.11.1.P.Seq


F 


M00039108D:B06 


CH09LNL


85 


380437 


RTA00002683F.c.09.1.P.Seq


F 


M00040039D:D06 


CH09LNL 


86 


430729 


RTA00002669F.h.l3.:.P.Seq 


F 


M00033226A:A11


CH08LNH


87 


376791 


RTA00002674F.LI7.LP.Seq 


F 


MOOCb^ Iood.uUo 




88 


373760 


RTA00002672F.p.20.1.P.Seq


F 


M00039049D:G07 


CH09LNL 


89 


373837 


RTA00002672F.p.22.1.P.Seq


F 


M00039050A:H10 


CH09LNL 


90 


376435 


RTA00002678F.h.17.2.P.Seq


F 


M00039476B:A02 


CH09LNL 


91 


373331 


RTA00002672F.b.20.1.P.Seq


F 


M00038638D:H03


CH09LNL 


92 


377086 


RTA00002676F.p.07.1.P.Seq


F 


M00039323D:D07


CH09LNL


93 


377839 


RTA00002672F.c.08.1.P.Seq


F 


M00038661A:A07 


CH09LNL 


94 


380442 


RTA00002684F.b.05.2.P.Seq


F 


M00040111C:D05


CH09LNL






WO 01/02568 



PCT/US00/18374 



SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION


CLONE ID


LIBRARY 


95 


374689 


RTA00002676F.m.13.2.P.Seq


F


M00039313B:B09


CH09LNL 


96 


375339 


RTA00002678F.m.23.2.P.Seq


F


M00039616A:B10


CH09LNL 


97 


14197 


RTA00002710F.f.15.1.P.Seq


F 


M00022084D:B01 


CH03MAH 


98 


380666 


RTA00002684F.c.04.2.P.Seq 


F 


M00040115B:H12


CH09LNL 


99 


377352 


RTA00002677F.1. 1 3.2. P.Seq 


F 


M00039404B:A05 


CH09LNL 


100 


379188 


RTA00002682F.a.03.1.P.Seq


F 


M00039914D:G12 


CH09LNL 


101 


428269 


RTA00002666F.c.13.1.P.Seq


F 


M00032539B:C11


CH08LNH 


102 


373464 


RTA00002671F.U 3.3. P.Seq 


F 


M00038327A:C11


CH09LNL 


103 


15527 


RTA00002710F.p.07.1.P.Seq


F 


M00022747D:E03 


CH03MAH 


104 


377504 


RTA00002671F.i.17.3.P.Seq


F 


M00038303C:D02 


CH09LNL 


105 


33508 


RTA00002710F.g.17.1.P.Seq


F 


M00022183B:C02 


CH03MAH 


106 


129179 


RTA00002662F.d.19.2.P.Seq


F 


M00007157C:F11


CH02COH 


107 


377086 


RTA00002676F.p.07.2.P.Seq 


F 


M00039328D:D07 


CH09LNL 


108 


375872 


RTA00002675F.h.15.1.P.Seq


F 


M00039233A:A03 


CH09LNL 


109 


375652 


RTA00002676F.i.07.3.P.Seq


F 


M00039303C:F11


CH09LNL 


110 


374266 


RTA00002674F.i.08.2.P.Seq 


F 


M00039144C:E06


CH09LNL 


111


378983 


RTA00002682F.a.07.1.P.Seq


F 


M00039915D:C11


CH09LNL 


112 


377343 


RTA00002684F.g.04.1.P.Seq


F 


M00040302C:A04 


CH09LNL 


113 


378679 


RTA00002681F.r'.16.2.P.Seq 


F 


M00039869B:F06 


CH09LNL 


114 


374095 


RTA00002671F.p.08.2.P.Seq


F 


M00038618C:C08


CH09LNL 


115 


375843 


RTA00002671F.o.06.2.P.Seq


F 


M000386I4C:HM 


CH09LNL


116 


377788 


RTA00002684F.h.01.2.P.Seq


F 


M00040305C:H06 


CH09LNL 


117 


21403 


RTA00002709F.j.05.1.P.Seq


F 


M0000692SD:D07 


CH02COH 


118


23184 


RTA00002709F.b.05.2.P.Seq


F 


M00005358B:B06


CH02COH 


119 


15671 


RTA00002710F.k.16.1.P.Seq


F 


M00022495D:H08 


CH03MAH 


120 


177367 


RTA00002663F.m.22.1.P.Seq


F 


M00022986D:H09 


CH03MAH 


121 


377788 


RTA00002684F.g.24.1.P.Seq


F 


M00040305C:H06


CH09LNL 


122 


375058 


RTA00002675F.h.02.1.P.Seq


F 


M00039230D:G12


CH09LNL 


123 


380412 


RTA00002680F.k.15.2.P.Seq


F 


M00039816B:D04


CH09LNL 


124 


178447 


RTA00002663F.n.06.1.P.Seq


F 


M00023007A:H04 


CH03MAH 


125 


376647 


RTA00002674F.h.07.1.P.Seq


F 


M00039140D:D09 


CH09LNL 


126 


44679 


RTA00002661F.e.19.1.P.Seq


F 


M00003800A:F09 


CH01COH


127 


377659 


RTA00002678F.a.04.2.P.Seq 


F 


M00039430B:F12


CH09LNL


128 


379703 


RTA00002682F.h.03.1.P.Seq


F 


M00039932C:H04 


CH09LNL 


129 


374673 


RTA00002673F.e.08.2.P.Seq 


F 


M00039063B:B04 


CH09LNL


130 


20513 


RTA00002710F.j.12.1.P.Seq


F 


M00022391D:F10


CH03MAH 


131 


376124 


RTA00002682F.n.17.1.P.Seq


F 


M00040021A:F09


CH09LNL 


132 


374679 


RTA00002676F.d.07.2.P.Seq 


F 


M00039281D:B04 


CH09LNL


133 


23134 


RTA00002709F.b.05.1.P.Seq


F 


M00005358B:B06


CH02COH 


134


4jQ*Oj 


RTA00(j0266SF.i.2j. 1 .P.Seq 


F 


M00033007C:E01


CH08LNH


135 


380442 


RTA00002684F.b.05.1.P.Seq


F 


M00040111C:D05


CH09LNL


136 


12374 


RTA00002709F.a.01.1.P.Seq


F 


M00004S25D:D05 


CH02COH 


137 


427466 


RTA00002665F.b.11.3.P.Seq


F 


M00028184D:G10 


CH08LNH


138 


36611


RTA00002668F.f.03.1.P.Seq


F 


M00032942D:C12


CH08LNH


139 


33756 


RTA00002662F.a.18.2.P.Seq


F 


M00005359A:D04


CH02COH 


140 


456026 


RTA00002694F.e.03.1.P.Seq


F 


M00043616C:A05 


CH20COHLV 


141 


15766 


RTA00002710F.k.02.1.P.Seq


F 


M00022444D:G01


CH03MAH 






SEQ
ID 


CLUSTER 


SEQ NAME




CLONE ID 
M00004839C:H02


LIBRARY 
CH02COH 


142 
143 


24352 
24354 


RTA00002709F.a.0? . 1 .P.Seq 
RTA00002709F.a.O-v 1 .P.beq 


1 


M00004832D:H02 


CH02COH 


144 


379114 


RTA00002681F.o.01.2.P.Seq


= 




M0003990' ; OF03 


CH09LNL 


145 


19609 


RTA00002709F.c.05.1.P.Seq




M00005457C:A03


CH02COH 


146 


21685 


RTA00002709F.e.2j. 1 .P.beq 


F 


M00006581D:F08


CH02COH 


147 


380085 


RTA00002682F.L 10. 1 .P.beq 


F 


M00039987A:F09 


CH09LNL 


148 


20700 


RTA00002710F.I. lo.l.P.beq 


F 


M000* n 373A:B05 


CH03MAH 


149 


379981 


RTA00002682F.1. 1 8. 1 r.beq 


F 


M0003998SA:E06 


CH09LNL 


150 


376591 


RTA00002675F.c.01.1.P.Seq


F 


M00039*U3 A'DOl 


CH09LNL 


151 


92058 


RTA00002663F.m.04.1.P.Seq


= 


M0O0' } ' , 89^ AH08 


CH03MAH 


152 


196936 


RTA00002663F.m.02.1.P.Seq


F 




CH03MAH 


153 


430702 


RTA00002668F.h.04.1.P.Seq


' c 


Mn0f)"PQ<50B' A. 1 1 


CH08LNH 


154 


378448 


RTA00002680F.n.21.2.P.Seq


F 


MflOO }98"P VB 12 


CH09LNL 


155 


41606 


RTA00002713F.e.10.1.P.Seq


F 


M00027301A:G05


CH04MAL 


156 


213817 


RTA00002664F.L1 9.1. P.Seq 




M00027634A:D11


CH04MAL 


157 


373464 


RTA00002671F.1. 13.1. P.Seq 


F 


M00038327A:C11


CH09LNL 


158 


379483 


RTA00002679F.k. 1 -. 1 .P.oeq 




M00039700B:D02 


CH09LNL 


159 


375796 


RTA00002680F.f.17.1.P.Seq




M00039795B:H10


CH09LNL 


160 


375796 


RTA00002680F.f.17.2.P.Seq


F 


M00039795B:H10 


CH09LNL 


161 


120485 


RTA00002663F.b.12.1.P.Seq




IVIUVJU U.I 1 — 


CH03MAH 


162 


374291 


RTA00002673F.f.17.1.P.Seq


^ 


M00039072C:E02
M00039428C:E01 


CH09LNL 
CH09LNL 


163 
164 
165 


380513 
379416 
378178 


RTA00002677F.p.l:-2.P.Seq 
RTA00002683F.j.07.2.P.Seq
RTA00002680F.1. 1 3. 1 .P.Seq 


F - 


M0004007"D:Cl 1 
M i)00 1 98^0 A. F1 1 


CH09LNL 
CH09LNL 


166 
167 
168 


427947 
427269 
20451 


RTA00002665F.n.24.1.P.Seq
RTA00002665F.d.03.3.P.Seq
RTA00002710F.j.10.1.P.Seq


F 


M00032495B:D02 
M00028212C:B08 
M0002239!3:E01 


CH08LNH 
CH08LNH 
CH03MAH 


169 
170 


377003 
427759 


RTA00002683F.g.O^.:. P.Seq 
RTA00002665F.0. 1 U .P.Seq 




M00040062B:B05
M00032499C:A01
M0003303-KT:A06 


CH09LNL 
CH08LNH 
CH08LNH


171 
172 


427549 
373881 


RTA00002668F.k.t31. P.Seq 
RTA00002672F.b.20.2.P.Seq 




M00038638D.H03 


CH09LNL 


173 
174 


188215 
379683 


RTA00002664FT. 13.2. P.Seq 
RTA0000268IF.d.04.:. P.Seq 




M00027200A:F02 
M00039357B:G10 


CH04MAL 
CH09LNL 


175 
176 
177 
178 
179 


380652 
378334 
377930 
378692 
32279 


RTA00002678F.d.lZ.:. P.Seq 
RTA00002679F.h.10.1.P.Seq
RTA00002680F.g.14.1.P.Seq
RTA00002680F.o.-U.j.r.beq 
RTA00002709F.d.23.1.P.Seq


F


M00039455D:H04
M00039682C:H11
M00039793B:B02
M00039835A:F07
M00005673B:B12
M00039782A:H10


CH09LNL 
CH09LNL 
CH09LNL 
CH09LNL 
CH02COH 
CH09LNL 


180 

181

182 
183 
184
185 


376379 
375963 
378683 
374946 
429583 
28333 


RTA00002680F.c.15.1.P.Seq
RTA00002675F.i.i:.l.P.Seq 
RTA00002680F.a.14.2.P.Seq
RTA00002673F.J.24.:. P.Seq 
RTA00002666F.2. 10.1. P.Seq 
RTA00002711F.e.r.l. P.Seq 




M00039238A:B12 
M00039773D:A09 
M00039096A:E07 
M00032584A:H08 
M00022930C:E02


CH09LNL 
CH09LNL
CH09LNL
CH08LNH 
CH03MAH 


186
187 
188 


427970 
379650 
379661 


RTA00002665F.j.lJ.l.P.5eq 
-RTA00002683F.h^2.:. P.Seq 
RTA00002676F.c.0:.:.P.Sea 




M00031368A:E10
M00040072C:G09
M00039277D:G10


CH08LNH
CH09LNL 

CH09LNL






SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY 


189 


376182 


RTA00002677F.b.l7.2.P.Seq 


F 


M00039340B:E07 


CH09LNL 


190 


374797 


RTA00002678F.b.12.2.P.Seq


F 


M00039444C:H02


CH09LNL


191 


375389 


RTA00002674F.a.13.1.P.Seq


F 


M00039120C:C09


CH09LNL 


192 


397115 


RTA00002683F.i.22.2.P.Seq


F 


M00040076C:D06


CH09LNL 


193 


186655 


RTA00002712F.i.21.1.P.Seq


F 


M00026941D:A04 


CH04MAL 


194 


404682 


RTA00002687F.b.13.1.P.Seq


F 


M00039766D:H01


CH14EDT 


195 


19609 


RTA00002709F.c.05.2.P.Seq 


F 


M00005457C:A03


CH02COH 


196 


404682 


RTA00002687F.b.13.2.P.Seq


F 


M00039766D:H01 


CH14EDT 


197 


380412 


RTA00002680F.k.15.1.P.Seq


F 


M00039816B:D04 


CH09LNL 


198 


394413 


RTA00002689F.d.17.3.P.Seq


F 


M00042742D:D05 


CH15CON 


199 


380086 


RTA00002679F.m.16.1.P.Seq


F 


M00039710C:G03


CH09LNL 


200 


430738 


RTA00002669F.L15.2.P.Seq 


F 


M00033231D:B09 


CH08LNH 


201 


40667 


RTA00002712F.g.22.1.P.Seq


F 


M00026882D:G09 


CH04MAL 


202 


397421 


RTA00002681F.c.16.2.P.Seq


F 


M00039854B:F09 


CH09LNL 


203 


398775 


RTA00002679F.f.11.1.P.Seq


F 


M00039675D:H05 


CH09LNL 


204 


87345 


RTA00002712F.f.19.1.P.Seq


F 


M00026850D:F09 


CH04MAL 


205 


379920 


RTA00002679F.b.24.2.P.Seq


F 


M00039660C:C10


CH09LNL 


206 


380666 


RTA00002684F.c.04.1.P.Seq


F 


M00040115B:H12


CH09LNL 


207 


404340 


RTA00002687F.b.05.2.P.Seq


F 


M00039764C:D07


CH14EDT 


208 


375509 


RTA00002680F.e.08.:. P.Seq 


F 


M00039790B:D03 


CH09LNL 


209 


46423 


RTA00002712F.i.02.1.P.Seq


F 


M00026914A:H10


CH04MAL 


210 


401713 


RTA00002685F.p.10.2.P.Seq


F 


M00039647A:H11


CH12EDT 


211 


377003 


RTA00002683F.g.09.1.P.Seq


F 


M00040062B:B05 


CH09LNL 


212 


378891 


RTA00002672F.L 18.1. P.Seq 


F 


M00039016A:A02 


CH09LNL 


213 


412778 


RTA00002685F.L07.2. P.Seq 


F 


M00039533D:F04 


CH12EDT 


214 


373786 


RTA00002679F.a.20.2.P.Seq 


F 


M00039655C:C07


CH09LNL 


215 


378692 


RTA00002680F.o.20.2.P.Seq


F 


M00039835A:F07 


CH09LNL 


216 


8S88S 


RTA00002713F.f.22.1.P.Seq


F 


M00027355A:B07


CH04MAL 


217 


358187 


RTA00002676F.b.04.2.P.Seq


F 


M00039273D:B02 


CH09LNL 


218 


377131 


RTA00002682F.e.10.1.P.Seq


F 


M00039938C:E11


CH09LNL 


219 


21488 


RTA00002708F.f.17.1.P.Seq


F 


M00004152A:C12 


CH01COH 


220 


447487 


RTA00002689F.e.04.3.P.Seq


F 


M00042895A:D10


CH15CON 


221 


364 


RTA00002710F.a.06.1.P.Seq


F 


M00007929C:B08 


CH03MAH 


222 


404024 


RTA00002687F.e.18.2.P.Seq


F 


M00039958A:A08


CH14EDT


223 


152305 


RTA00002712F.d.02.1.P.Seq


F 


M00023376B:G04 


CH04MAL


224 


106050 


RTA00002713F.o.17.1.P.Seq


F 


M00027668C:H12 


CH04MAL 


225 


41126 


RTA00002713F.l.12.1.P.Seq


F 


M00027546C:B10


CH04MAL 


226 


113496 


RTA00002713F.n.20.1.P.Seq


F 


M00027625A:H01


CH04MAL 


227 


447487 


RTA00002689F.e.04.1.P.Seq


F 


M00042895A:D10 


CH15CON 


228 


146335 


RTA00002712F.j.17.1.P.Seq


F 


M00026980A:D09 


CH04MAL 


229 


376647 


RTA00002674F.h.07.2.P.Seq


F 


M00039140D:D09


CH09LNL 


230 


376746 


RTA00002674F.f.12.2.P.Seq


F 


M00039133B:F08


CH09LNL 


231 


373523 


RTA00002674F.n.21.2.P.Seq


F 


M00039177B:D03


CH09LNL 


232 


455466 


RTA00002694F.c.10.1.P.Seq


F 


M00043461D:E06 


CH20COHLV


233 


374031 


RTA00002683F.p.17.2.P.Seq


F 


M00040105C:FM 


CH09LNL 


234 


373997 


RTA00002673F.m.04.2.P.Seq


F 


M00039105C:B08


CH09LNL 


235 


4557(7 


RTA00002694F.a.06.1.P.Seq


F 


M00042593C:G06 


CH20COHLV 






SEQ 
ID 


CLUSTER 


SEQ NAME


ORIENTATION 


CLONE ID 


LIBRARY 


283 


380500 


RTA00002o/Ur.p. 1 v.—r.aeq 


F


M00033583B:E06


CH09LNL 


284 


34928 


RTA00UO- / 1 Ur .p.- i.i.r .ocq 


F 



M00022795B:G06 


CH03MAH 


285 


374028 


RTAOOOOio /4r .K.Uj.-i.r.^eq 


F 



M00039156A:B11


CH09LNL 


286 


374121 


RTA000026 flr.n — l.^.r.ieq 


F 



M000390t3A:C09 


CH09LNL 


287 


429547 


RTAOOOU-OOor.c.u / . l.r.oeq 


F 


M000329J7D-.G09 


CH08LNH 


288 


380668 




F 


M00033581C:H10


CH09LNL 


289 


258704 


n -r * AAAA")££<C m fi£ 1 P ^r>n 

RTA0QUU-iO00r.rn.uo. i .r.ocq 


F 


M00032480B:E10


CH08LNH 


290 


380325 




F 


M00033533D:B05 


CH09LNL 


291 


378326 


n-r a AAfiniAQ 1 C m II TP ^*»n 

RTAOOUU-ioa l r .m. i i.-i.r.oeq 


F 


M00039896C:H01


CH09LNL 




375618 


RTAOOOU-0 Or. a. 1 j.i.r.aeq 


F 


M00039218A:F03 


CH09LNL 


293 


20999 


RTAOOUU- /UVr.J. 10. i.r .jcq 


F 


M00006977C:G04 


CH02COH 


294 


29102 


RTA00OQ27 1 Ur .p. 1 3. I .r.oeq 


F 



M00022793D:B01 


CH03MAH 


295 


379334 


RTAOOOUiooUr .0.--. I .r.oeq 


F 


M00039778C:A04 


CH09LNL 


296 


23943 


RTA00002 /09P.1. 1 — l.r.beq 


F


M00006836D:H02 


CH02COH 


297 


373998 


RTA00002o/ir.a. lU.J.r.seq 


F 



M00038631D:B02


CH09LNL 


298 


373325 


RTA000026/-r.c.l4..;.r.3eq 


F


M00038662B:A12 


CH09LNL 


299 


373818 


RTA00002o/Jr.e. l D.-i.r.aeq 


F 


M00038995C:G03 


CH09LNL 


300 


429843 


RTAOQUU-OOor .c. i U. l .r.jcq 


F 


M00032913B:E06 


CH08LNH 


301 


427755 


r» -r* 4 nnAAi £ £ < V A IOTP Q»»n 

RTAOOOOiOOJr .u. I V.j.r.jcq 


F 


M00028316B:H12


CH08LNH 


302 


189177 


RTA00OO2664r .c._j..i.r.5eq 


F 



M000"'69' ? * ? C:G03 


CH04MAL 


303 


13294 


i-* t> t nAAAHAAC i 1 { 1 DC j#i 


F 


M00006963A:G08 


CH02COH 


304 


178801 


RTAOOuUiOOJ r .n.U l . I .r.ocq 


F 


M00022997A:F06 


CH03MAH 


305 


230865 


RTA00002664F.d.Oj.J.r.beq 


F 


M000 7 69 n 3DA03 


CH04MAL 


306 


178801 


RTA00OO266jr.m._4. t .r.seq 


F


MOOO^^^^ / A F06 


CH03MAH 


307 


378809 


RTA000026 /2r .g.i 1 .-.r.oeq 


F 



M00039005C:H01 


CH09LNL 


308 


378957 


RTA000U-O /Ur .a. l / .•.r.ocq 


F 



M00033362C:C05


CH09LNL 


309 


373523 


r*-r \ AAnni/i7ir n ^ 1 IP ^r»n 

RTA00OUio74r .n.- 1 . 1 .r . ^c_q_ 


F 


M00039177B:D03 


CH09LNL 


310 


375458 


RTAOOOu-O /or l.UO.-.r.oeq 


F 


M00039611D:D11


CH09LNL 


311 


429794 




F 


M00032918B:D08


CH08LNH 


312 


72797 


n-T* * AAAn"l££ 1 C a A "7 1 P Qjh 

RTAOOOUioo l r .e.u / . 1 .r.occ 


F 


M00003?61C:F02 


CH01COH


313 


429992 


RTAUUUU-OOor.C.- 1- 1 .r.ocq 


F 


M00032921B:H08 


CH08LNH 


314 


374410 


ot \ AAnm If 1 1 "* P Sen 
RTAUUUU-0 /4r .K. i i.-.r.ocL 


F 


M00039153B:G12


CH09LNL 


315 


376553 


RTAOOOUio /4r .g. I v. l .r.ocq 


F 


M00039139A:C09 


CH09LNL 


316 


429096 


T\T * AAAA")^££ C f l£ 1 P C^n 

RTA000O2ooor.r. to. I .r.ocq 


F 



M00032573A:G06 


CH08LNH 


317 


181948 


RTAOOOU-oojr.n.uj. i .r.occ 


F 


M00023003C:D07 


CH03MAH 


318 


378475 


RTAU00Uj:O i -r.n.U I ._.r. ocq 


F 


M00039006D:B01 


CH09LNL 


319 


427336 


r»-r * AAnri") A£ r /- *? * 1 P S*»n 
R T AOUUU-OOjt .c.-j. i .r.oct, 


F 


M000:S2t0B:DO2 


CH08LNH 


320 


374042 


R T AUUUUJo /-r .a.Uo.-.r. jcq 


F 


M0003So3lC:BlG 


CH09LNL 


321 


386543 


ot \ AAAA-]i;7->p r t ; t p c en 


F 


M00038°99B:Gll 


CH09LNL 


322 


376659 


RTA00002678F.h.11.2.P.Seq


F 


MOOO.'^TSCiElO 


CH09LNL 


323 


29135 


RTA00002663F.c.09.1.P.Seq


F 


M00021°23C:D11 


CH03MAH 


324 


377967 


RTA00002681F.m.17.2.P.Seq




M00039S^7D:C10 


CH09LNL 


325 


431330 


RTA0000266SF.m.lb.2.P.Seq 


F 


M0003JO~4A:COS 


CH08LNH 


326 


373824 


RTA00002680F.i.19.2.P.Seq


F 


M0003^SOSD:H02 


CH09LNL 


327 


50094 


RTA0000266lFj.O:.2.P.Seq 


F 


M000043 7SA:Bl.O 


CH01COH 


328 


214272 


RTA00002664F.h.03.2.P.Seq


F 


M00027366A:F11


CH04MAL 


329 


377293 


RTA00002680F.b.17.2.P.Seq


F 


M0003^""7C:E05 


CH09LNL 



SEQ
ID 
330 
331 



332 
-t -* 



334 



341 



342 



347 
343 



355 
356 
357 



360 



361 



362 



363 



CLUSTER 
195053 
21274 



364 



36: 



366 



367 



368 



369 



370 



37: 



376 



SEQ NAME 
~RTA00002665 h .n. 1 0. l.P.Seq 
rRTA0OOQ27O9F.m.Q9. 1 .P.Se"g 



376580 
374725 



RTA00OQ2675F.b.20.1. P.Seq 
RTA00002673F.r.Q2.2.P.Seq 



"25238 I RTA000027IOF.n.08.I.P.Seq 



ORIENTATION 



377337 | RTA00002683F.l.07.2.P.Seq

450485 | RTA00002692F.a.13.2.P.Seq

21989 RTA00002709F.h.22.1.P.Seq 

58296 | RTA00002661F.i.20.2.P.Seq

379144 RTA00002679F.1. 14.1. P.Seq 

V79690 1 RTA0000268 0F,b.21.2.P.Seq 



379640 



RTA00002681F.d.12.2.P.Seq



'173988 1 RTA00002673F.h.23.KP.Seq 



373983 I RTA00002673F.h.23.2.P.Seq 
180673 1 RTA00Q 02673F.j.l3.2.P.Seq 



S51 43 I RTA00002661F.i.06.2.P.Seq 
15§57 1 RTA0000 2713F.h.2I.I.P.Seq 



375467 | RTA00002677F.m.Q3. 1 .P.Seq 



"398406 F RTA00002679Fj.02. 1. P.Seq 



430392 
376746 
115595 
377182 



RTA0Q00266SF.k. 19. 1 .P.Seq 
RTAQ000:674F.f. 12.1. P.Seq 
RTA000027 1 3F.e.Q7. 1 .P.Seq 
RTA00002682F.L1 LlP-Seq" 



380659 I RTA00002634F.e.07.2.P.Seq 
373862 1 RTA000026 71F.g.0 1.1. P.Seq 



376096 I RTA00002677F.b. 1 6.2.P.Seq 
372887 1 RTA00002670F.d.0S.2.P.Seq 
178475 1 RTA00OO2672 F.g.24.2.P.Seq 



427336 1 RTA000O2665F.c.23.3.P.Seq 
373814 I RTA0000 2672F.b.02.2.P.Seq 



215506 1 RTA00002664F.h.08.2. P.Seq 



374465 rRTA00QO2673F.c.07.2.P.Seq 



428784 . I RTA00002667FX. 18.1. P.Seq 



179S81 RTA00002676F.a.21.2.P.Seq 



"378371 I RTA000O:678F7f.20.2.P.Seq 



375154 1 R TA00002676F.c.l3.2.P.Seq 



431214 RTA00002669F.k.04.LP.Seq 



376053 | RTA0O0O2675F.l.O3.I.P.Seq 



373282 1 RTA 00002680F.j.l9.2.P.Seq 



3397 1 RT AO00O2661F.h.04,l.P.Seq 

1. P.Seq 



376706 



373292 
431612 



RTA000026T5 F.C.Q2 

i.09 



373471 



RTA00002631F.i.Q9.2.P.Seq 
RTA00002669F.e.23.3.P.Sea 
RTA00002679F.Q. 1 7- 1 -P.Seq 



373666 
374894 



RTA00002631F.i.05.2.PSej 
RTA00002675F.f.04. 1 P.Seq 



430191 | RTAQ000:667F.j.24.1.PSeq 






CLONE ID 
M00023044B:D02
M00007194A:B09



M00039212C:C12 
M00039070D:C02



M00022634D:C08 



M00040085D:A10 
M00042625C:B04 
M00006861B:F09
M00004354D:E05 
M00039705D:F02 
M00039773B:G03 



M00039359C:G10 



M00039079A:A05 



M00039079A:A05 
M00039084C:H03 



M0000423:D:Cn 
M000:"398C:FO7 



M00039417A:D03 
M00039639C:E08 



MQ0033Q37D:C11 
MOOOr^^jBtFOS 
iM00Q:729:A:C04 
M0O040O10A:F10 



M00040124D:H01
M00033284B:H04 



M00Q3934OA:D0 

M0003333SA:H12 

M00039006D:B01



M00028210B:D02 
M00033635A:G09 



CH09LNL 
CH09LNL 



CH03MAH 
CH09LNL 
CH18CON
CH02COH 
CH01COH 
CH09LNL 
CH09LNL 



CH09LNL 



CH09LNL 



CH09LNL 
CH09LNL 



CH01COH 
CH04MAL 



CH09LNL 



CH09LNL 



CH08LNH 
CH09LNL 
CH04MAL
CH09LNL 



CH09LNL 
CH09LNL 



CH09LNL 
CH09LNL 
CH09LNL 



M00027438C:G07 
M00039053C:H02



M00032744B:F10 



M00039273B:F02 



M00039465A:A08 



M00039279B:H02 



M0003326:D:A11 



M00039249A:C12 



M00039813B:P11 



M00004163A:G11 



M0OO392UB:FO5 



M00039S80A:H11 
M00033:0ZD:G06 



M0003972"C:309 



M0Q039379C:F0j 
M00039224A:E12 



M00032829B:E06 



CH08LNH
CH09LNL 



CH04MAL 



CH09LNL 



CH09LNL 



CH09LNL 
CH09LNL 



CH08LNH



CH09LNL 



CH09LNL 



CH01COH



CH09LNL 
CH09LNL 



CH08LNH



CH09LNL 
CH09LNL 



CH09LNL 
CH08LNH






SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY 


377 


428581


RTA00002667F.c.l2.1.P.Seq 


F 


M00032739A:A06 


CH08LNH 


378 


379598 


RTA00002679F.k.03.1.P.Seq


F 


M00039697B:F11


CH09LNL 


379 


45300 


RTA00002710F.j.23.1.P.Seq


F 


M00022434D:D06


CH03MAH 


380 


23030 


RTA00002709F.b.10.1.P.Seq


F 


M00005384A:C11


CH02COH 


381 


379928 


RTA00002679F.o.06.1.P.Seq


F 


M00039720D:D02


CH09LNL 


382 


430191 


RTA00002667F.k.01.1.P.Seq


F 


M00032329B:E06 


CH08LNH 


383 


374684 


RTA00002675F.2.02. 1 .P.Seq 


F 


M00039228A:B05 


CH09LNL 


384 


375728 


RTA00002676F.h.05.2.P.Seq 


F 


M00039299B:G12 


CH09LNL 


385 


230237 


RTA00002670F.b.08.2.P.Seq 


F 


M00033306D:H09 


CH09LNL 


386 


380673 


RTA00002673F.j.13.1.P.Seq


F 


M00039084C:H03


CH09LNL 


387 


378933 


RTA00002679F.k.20.1.P.Seq


F 


M00039702.V.B12 


CH09LNL 


388


375115


RTA00002673F.e.01.1.P.Seq


F 


M00039066D:G08


CH09LNL 


389 


378673 


RTA00002680F.p.21.2.P.Seq


F 


M00039833A:F05 


CH09LNL 


390 


372909 


RTA00002670F.a.12.2.P.Seq


F 


M00033300D:H12


CH09LNL 


391 


373300 


RTA00002674F.c.21.1.P.Seq


F 


M00039126D:A08 


CH09LNL 


392 


379318 


RTA00002683F.h.16.2.P.Seq


F 


M00040071B:A10 


CH09LNL 


393 


378319 


RTA00002681F.k.07.2.P.Seq


F 


M00039890A:H05 


CH09LNL 


394 


374608 


RTA00002675F.g.20.1.P.Seq


F 


M00039230A:A10


CH09LNL 


395 


374323 


RTA00002673F.c.24.2.P.Seq


F 


M00039061B:F08 


CH09LNL 


396 


374328 


RTA00002673F.d.01.2.P.Seq


F 


M00039061B:F08 


CH09LNL 


397 


428401


RTA00002667F.b.07.1.P.Seq


F 


M00032725C:F06


CH08LNH


398 


136202 


RTA00002687F.p.05.2.P.Seq


F 


M00040349D:B09 


CH14EDT


399 


374394 


RTA00002673F.c.15.1.P.Seq


F 


M00039059C:G08 


CH09LNL 


400 


37784 


RTA00002708F.c.17.1.P.Seq


F 


M00003816D:E11


CH01COH 


401 


378232 


RTA00002681F.h.11.1.P.Seq


F 


M00039876D:H09 


CH09LNL


402 


185663 


RTA00002712F.p.17.2.P.Seq


F 


M00027173B:G09


CH04MAL 


403 


14866 


RTA00002709F.d.14.1.P.Seq


F 


M00005623D:G12 


CH02COH 


404 


383502 


RTA00002670F.k.07.2.P.Seq 


F 


M00033446D:B02 


CH09LNL 


405 


13463 


RTA00002709F.f.18.1.P.Seq


F 


M00006657C:G05


CH02COH 


406 


21274 


RTA00002709F.m.09.2.P.Seq


F 


M00007194A:B09 


CH02COH 


407 


13745 


RTA00002714F.b.13.1.P.Seq


F 


M00027801C:C11


CH04MAL 


408 


23485 


RTA00002714F.c.10.1.P.Seq


F 


MOOO:?836D:FI2 


CH04MAL 


409 


428364


RTA00002667F.c.09.1.P.Seq


F 


M00032737B:E09


CH08LNH


410 


431629 


RTA00002669F.l.14.2.P.Seq


F 


M00033276B:G08


CH08LNH


411 


379754 


RTA00002682F.h.OS. 1 .P.Seq 


F 


M00039983D:A06 


CH09LNL 


412 


431601 


RTA00002669F.k.08.2.P.Seq 


F 


M00033263B:G04 


CH08LNH


413 


375749 


RTA00002680F.f.23.2.P.Seq 


F 


M00039795D:G06


CH09LNL


414 


373764 


RTA00002681F.j.04.2.P.Seq


F 


M0003 C >884A:HU 


CH09LNL 


415 


215605 


RTA00002664F.L20. LP.Seq 


F 


M0002"647C:D03 


CH04MAL 


416 


376144 


RTA00002675F.j.09.1.P.Seq




\jnr»n *G"*_i i a tr l l 

IVIUI/'J.* 1 .-\ . — I l 


CH09LNL


417 


373071 


RTA00002670F.j.23.2.P.Seq




M00033442A:D06


CH09LNL 


418 


379684 


RTA00002681F.c.09.2.P.Seq




M00039851B:G11


CH09LNL 


419


379610 


RTAOO0O:630F.k.t 1.2. P.Seq 




M00039815C:F09


CH09LNL


420 


22392 


RTA00002708F.a.10.1.P.Seq




MOO00!395D:HO2 


CH01COH 


421 


377555 


RTA00002683F.l.08.2.P.Seq




M00040085D:E04 


CH09LNL 


422 


32624 


RTA00002713F.f.15.1.P.Seq




M00027347C:G07


CH04MAL 


423 


375024 


RTA000026~5F.p.l2. LP.Seq 




M0003O266D-F12 


CH09LNL






SEQ 
ID 
424 


CLUSTER 
374725


SEQ NAME
RTA00002673F.f.02.1.P.Seq


ORIENTATION


CLONE ID
M00039070D:C02 


LIBRARY
CH09LNL 


425 
426 
427 
428 
429 


376223 
375906 
186190 
57694 
7007 


RTA00002676F.f.l9.2.P.Seq 
RTA00002675F1 18.1. P.Seq 
RTA00002714F.a.04.1.P.Seq
RTA00002713F.f.02.1.P.Seq
RTA00002709F.d.08.1.P.Seq




M00039293A:H04 
M00039238D:A08 
M00027729D:H06 
M00027319D:B11
M00005614B:B01


CH09LNL 
CH09LNL 
CH04MAL 
CH04MAL 
CH02COH 


430 

431 

432 

433 

434 

435 

436 

437 

438 

439 

440 

441 

442 

443 

444 

445 

446 

447 

448 

449 

450 

451 

452 

453 

454 

455 

456 

457 

458 

459 

460 

461 

462 

463 

464 

465 

466 

467 

468 

469 

470 


400084 
375643 
166493 
379632 
373234 
401230 
186623 
127714 
451857 
404620 
186872 
42729 
373380 
374465 
403557 
16749 
375592 
376103 
40228 
374606 
378270 
236321 
378676 
373252 
384601 
403772 
379566 
136202 
14317 
375349 
403020 
374060 
183399 
373789 
20168 
452641 
431370 
153044 
378229 
374328 
39606 


RTA00002685F.o.19.2.P.Seq
RTA00002676F.h.18.2.P.Seq
RTA00002663F.h.08.1.P.Seq
RTA00002682F.h.14.1.P.Seq
RTA00002676F.g.15.2.P.Seq
RTA00002685F.i.05.2.P.Seq
RTA00002712F.f.15.1.P.Seq
RTA00002712F.k.14.1.P.Seq
RTA00002692F.a.01.1.P.Seq
RTA00002687F.c.03.2.P.Seq
RTA00002663F.k.23.1.P.Seq
RTA00002709F.c.06.2.P.Seq
RTA00002674F.b.07.1.P.Seq
RTA00002673F.c.07.1.P.Seq
RTA00002687F.d.10.2.P.Seq
RTA00002709F.b.14.2.P.Seq
RTA00002680F.f.22.2.P.Seq
RTA00002676F.g.06.2.P.Seq 
RTA00002712F.l.18.1.P.Seq
RTA00002673F.j.23.1.P.Seq
RTA00002630F.h.OS.:.P.Seq 
RTA00002668F.k.14.1.P.Seq
RTA00002680F.m.20.2.P.Seq
RTA00002670F.k.16.2.P.Seq
RTA00002670F.k.06.2.P.Seq
RTA00002687F.a.03.2.P.Seq 
RTA00002683F.k.04.1.P.Seq
RTA000O2637F.p.0-V t .P.Seq 
RTA00002713F.c.13.1.P.Seq
RTA00002672F.j.11.1.P.Seq
RTA00002687F.3.02.2. P.Seq 
RTA00002672F.L07.1. P.Seq 
RTA00002712F.o.10.1.P.Seq
RTA00002671F.c.20.1.P.Seq
dt \ nnnrn" i i p h 11 t P Sea 
RTA0OOO2692F.d.05.:. P.Seq 
RTA00002669F.m.04.:.P.Sec 
RTA00002713F.j.03.1.P.Seq
RTA00002679F.c.16.1.P.Seq
RTA00002673F.d.01.1.P.Seq
RTA00002713F.i.20.1.P.Seq


1 F 
F 


M00039641C:D07 

M00039301B:F06 

M00022492C:A02 

M00039984B:G12 

M00039297C:H08 

M00039533A:C12 

M00026843B:D10 

M00027018A:C09 

M00042534B:C10 

M00039770A:G11

M00022707B:G08

M00005458A:F11

MO0OJ9!23A:31O 

M00039058C:H02

M00039943A:E03 

M00005402B:F08 

M000J97*)SD:E10 

M00039295B:D03

M00027049B:F05 

M0003W6A:A05 

M000;^S01A:H1 1 

M00033034C:F02 

M0003«327B:F07 

M00033451A:H01

M00033446C:G08 

M00039746C:G09 

M00040031C:E01

M00040349D:B09 

M00027248A:C02 

M000.^O:4B:B10 

M0003°746C:A08 

M0003°014B:C04 

M00027136C:C09

M0003S259B:G08 

M000::S34B:G1 1 

M00043003C:D08

M000332S3B:D12 

M000:"4"6A:C09 

MO003°o63C:G09 

MQ003°061B:FO8 

MO00:"46SA:CO9 


CH12EDT 

CH09LNL 

CH03MAH 

CH09LNL 

CH09LNL 

CH12EDT 

CH04MAL 

CH04MAL 

CH18CON

CH14EDT

CH03MAH 

CH02COH 

CH09LNL 

CH09LNL 

CH14EDT 

CH02COH 

CH09LNL 

CH09LNL 

CH04MAL 

CH09LNL 

CH09LNL 

CH08LNH

CH09LNL 

CH09LNL 

CH09LNL 

CH14EDT

CH09LNL 

CH14EDT

CH04MAL 

CH09LNL 

CH14EDT 

CH09LNL 

CH04MAL 

CH09LNL 

CH03MAH 

CH18CON

CH08LNH

CH04MAL 

CH09LNL 

CH09LNL

CH04MAL 






SEQ
ID


CLUSTER


SEQ NAME


ORIENTATION


CLONE ID 


LIBRARY 


471 


59077 


RTA00002713F.n.01.1.P.Seq


F 


M00027596C:E06 


CH04MAL 


472 


1935 


RTA00002710F.b.11.1.P.Seq


F 


M00008006B:B03 


CH03MAH 


473 


379684 


RTA00002681F.c.09.1.P.Seq


F 


M00039851B:G11


CH09LNL 


474 


451564


RTA00002691F.f.12.2.P.Seq


F 


M00043411D:H06


CH17COHLV 


475 


7571 


RTA000027 1 OF.a. I o . L P.Seq 


F 


M00007943D:C09 


CH03MAH 


476 


129323 


RTA00002713F.k.21.1.P.Seq


F 


M00027525B:D06 


CH04MAL 


477 


12960 


RTA000027 1 OF.a.2 j . 1 . P.Seq 


F 


M00007976A:C10


CH03MAH 


478 


186730 


RTA000027 1 j F.o.Od . 1 .P. Sec 


F 


M00027641C:A03 


CH04MAL 


479 


59077 


RTA00002713F.m.24.1.P.Seq


F 


M00027596C:E06 


CH04MAL 


480 


185884 


RTA00002712F.b.06.1.P.Seq


F 


M00023316C:G08


CH04MAL 


481 


19471 


RTA00002708F.g.08.1.P.Seq


F 


M00004197B:H10


CH01COH 


482 


45206 


RTA00002710F.c.06.1.P.Seq


F 


M00008063B:A06 


CH03MAH 


483 


404257 


RTA00002687F.g.06.2.P.Seq


F 


M00040208A:C03 


CH14EDT 


484 


372997 


RTA00002679F.p.04.1.P.Seq


F 


M00039729A:A10


CH09LNL 


485 


43792 


RTA00002713F.k.16.1.P.Seq


F 


M00027520A:C05 


CH04MAL 


486 


400052 


RTA00002687F.h.13.2.P.Seq


F 


M00040291D:C05


CH14EDT


487 


452194 


RTA00002692F.c.14.2.P.Seq


F 


M00042988A:F06 


CH18CON


488 


24034 


RTA00002710F.b.06.1.P.Seq


F 


M00007992C:F06


CH03MAH 


489 


447544 


RTA00002689F.e.18.1.P.Seq


F 


M00042905D:D02 


CH15CON 


490 


401872 


RTA00002636F.C.23. LP.Seq 


F 


M00040141D:F05


CH13EDT 


491 


376553 


RTA00002674F.g.19.2.P.Seq


F 


M00039139A:C09 


CH09LNL 


492 


455051 


RTA00002694F.a.07.1.P.Seq


F 


M00042595A:A11


CH20COHLV 


493 


16760 


RTA00002708F.j.03.1.P.Seq


F 


M00004393B:E07 


CH01COH 


494 


374174 


RTA00002672F.L 1 2.2. P.Seq 


F 


M00039015A:D07 


CH09LNL 


495 


374283 


RTA00002672F.k.21.2.P.Seq


F 


M00039030B:E02 


CH09LNL 


496 


375772 


RTA00002681F.o.24.1.P.Seq


F 


M00039909C:G05


CH09LNL 


497 


376417 


RTA00002678F.i.03.2.P.Seq


F 


M00039477D:A10


CH09LNL 


498 


428971 


RTA00002666F.o.02.1.P.Seq


F 


M0003267SCD06 


CH08LNH 


499 


394098 


RTA0000268 1 F.j. I LP.Seq 


F 


M00039887C:E07


CH09LNL 


500 


379761 


RTA00002670F.n.03.1.P.Seq


F 


M00033561C:A02 


CH09LNL 


501


374266 


RTA00002674F.1.08.1.P.Seq


F 


M00039144C:E06 


CH09LNL 


502


372946 


RTA00002670F.l.07.1.P.Seq


F 


M00033457D:A05



CH09LNL 


503


228909 



RTA00002664F.e.08. 1 .P.Seq 


F 


M0002708^C:E1 1 


CH04MAL 


504


427524 


RTA00002665F.e.05.1.P.Seq


F 


M00028354D:A03 


CH08LNH 


505


380413 


RTA00002680F.k.19.2.P.Seq


F 


M00039816C:D05


CH09LNL 


506 


373866 


RTA00002671F.c.24.2.P.Seq


F 


M00038259C:H09 


CH09LNL 


507


427202 


RTA0000266^F.g. ID. 1 .rbeq 


F 


M00028617C:A12


CH08LNH 


508 


373000 


RTA00002670F.j.13.1.P.Seq


F 


M00033437C:C03


CH09LNL 




J /OOJO 


I\. 1 r\UUUU-0 / or. p. i i.-.r.oeq 


F


MnAAlOAt"*"- \ in 
IVIUUU j VO- /L . A I U 


CH09LNL


510 


24945 


RTA00002710F.p.05.1.P.Seq


F 


M00022739A:B03 


CH03MAH 


511 


20277 


RTA00002710F.e.17.1.P.Seq


F 


M00021972D:C11


CH03MAH 


512 


20820 


RTA00002710F.e.02.1.P.Seq


F 


M0002!9I*5C:AIO 


CH03MAH 


513 


376791 


RTA00002674F.l.17.2.P.Seq


F 


M00039166B:G06


CH09LNL 


514 


9S09 


RTA00002710F.g.12.1.P.Seq


F 


M00022173B:D06


CH03MAH 


515 


429562 


RTA00002667F.m.03.1.P.Seq


F 


M00032853D:G12 


CH08LNH 


516 


12920 


RTA00002710F.e.15.1.P.Seq


F 


M00021964C:E10 


CH03MAH 


517 


377565 


RTA00002684F.h.19.1.P.Seq


F 


M00040309A:E11


CH09LNL 



WO 01/02568 



PCT/US00/18374 



SEQ 
ID 


CLUSTER 


SEQ NAME


ORIENTATION 


CLONE ID


LIBRARY


518 


429356 


RTA00002668F.d.23.1.P.Seq


F 


M00032933A:C10 


CH08LNH 


519 


427634 


RTA00002665F.f.09.1.P.Seq


F 


M00028369D:E08 


CH08LNH


520 


427713 


RTA00002665F.e.23.1.P.Seq


F 


M00028364C:G08 



CH08LNH 


521 


373607 


RTA00002674F.d.15.2.P.Seq


F 


M00039127D:E10



CH09LNL 


522 


378781 


RTA00002674F.o.14.1.P.Seq


F 


M00039196B:H06



CH09LNL 


523 


429361 


RTA00002666F.d. 11.1 .P.Seq 


F 


M00032550D:C02 


CH08LNH 


524 


126754 


RTA00002663F.a.16.1.P.Seq


F 


M00003045A:H02 


CH03MAH


525 


428047 


RTA00002665F.k.10.1.P.Seq


F 


M00031417C:G09 



CH08LNH 


526 


18863 


RTA00002709F.d. 15.1 .P.Seq 


F 


M00005625A:C02 


CH02COH 


527 


379761 


RTA00002670F.n.03.2. P.Seq 


F 


M00033561C:A02 


CH09LNL


528 


46407 


RTA00002665F.c.10.1.P.Seq


F 


M00028196D:A03


CH08LNH 


529 


21365 


RTA00002709F.k.06. 1 .P.Seq 


F 


M00007012D:H08 


CH02COH


530 


427466 


RTA00002665F.b.11.1.P.Seq


F 


M00028184D:G10


CH08LNH 


531 


400265 


RTA00002685F.c.03.2.P.Seq


F 


M00039374B:B07 


CH12EDT


532 


380056 


RTA00002680F.a.16.2.P.Seq


F 


M00039773D:F11


CH09LNL


533 


375324 


RTA00002678F.l.12.2.P.Seq


F 


M00039612B:B10



CH09LNL 


534 


25165 


RTA00002710F.k.17.1.P.Seq


F 


M00022496B:E12


CH03MAH


535 


401296 


RTA00002685F.h.23.2.P.Seq


F 


M00039529C:D07 


CH12EDT


536 


394098 


RTA00002681F.j.15.2.P.Seq


F 


M00039887C:E07



CH09LNL 


537 


17430 


RTA00002710F.i.11.1.P.Seq


F 


M00022365D:A03 


CH03MAH


538 


373820 


RTA00002674F.d.06.1.P.Seq


F 


M00039127A:G11


CH09LNL 


539 


378548 


RTA00002672F.a.14.2.P.Seq


F 


M00039004B:C11


CH09LNL 


540 


222679 


RTA00002664F.f. 18.2. P.Seq 


F 


M0002"22SD:A0l 


CH04MAL 


541 


376874 


RTA00002670F.e.23.2.P.Seq 


F 


M00033375A:G04 


CH09LNL


542 


21329 


RTA00002709F.b.08. 1 .P.Seq 


F 


M00005379A:E04 


CH02COH


543 


119905 


RTA00002710F.p.13.1.P.Seq


F 


M00022785C:G06 


CH03MAH


544 


377028 


RTA00002678F.n.21.2.P.Seq


F 


M00039631A:C10




545 


373351 


RTA00002671F.l.18.3.P.Seq


F 


M00038327D:A05 


CH09LNL


546 


376082 


RTA00002674F.m.17.1.P.Seq


F 


M00039171B:D11


CH09LNL


547 


376987 


RTA00002678F.g.21.2.P.Seq


F 


M00039472C:B08 


CH09LNL


548 


61921 


RTA0O002661F.S.0S.I. P.Seq 


F 


M00003995B:E03


CH01COH


549 


373486 


RTA00002672F.b.03.2. P.Seq 


F 


M00038635B:COS 


CH09LNL


550 


380355 


RTA00002670F.o.06.2.P.Seq 


F 


M00033570C:C10 



CH09LNL


551 


430295 


RTA00002667F.h.14.1.P.Seq


F 


M00032808B:G10


CH08LNH


552 


379221 


RTA00002682F.n.01.1.P.Seq


F 


M00040017D:G03



CH09LNL


553 


373532 


RTA00002672F.d.10.2.P.Seq


F 


M00038991A:D01


CH09LNL


554 


375633 


RTA00002677F.m.05.2.P.Seq


F 


M00039417B:f01 




555 


378356 


RTA00002681F.f.07.1.P.Seq


F 


M00039866B:A08


CH09LNL


556 


376196 


RTA00002674F.m. 12.1. P.Seq 




M00039170C:F05


CH09LNL


557 


375115


RTAUwUU-0 t Jr. □ .-<+ — .r .jcq 


F


M00039066D:G08 


CH09LNL 


558 


375115 


RTA00002673F.e.01.2.P.Seq




M00039066D:G08 


CH09LNL 


559 


378600 


RTA00002679F.i.03.1.P.Seq




M00039686C:E06


CH09LNL 


560 


375351 


RTA00002680F.e.15.1.P.Seq




M00039792A:B04


CH09LNL 


561 


25237 


RTA00002710F.n.23.1.P.Seq




M00022671B:A08 


CH03MAH 


562 


193503 


RTA00002663F.n. 15.1. P.Seq 




M00023039D:B05


CH03MAH 


563 


428268 


RTA00002667F.b.01.1.P.Seq




M00032724A:C05


CH08LNH 


564 


379440 


RTA00002683F.j.21.2.P.Seq




M00040080C:C06 


CH09LNL 






SEQ 
ID 
565 
566 
567 
568 
569 
570 
571 
572 


CLUSTER 
374502 
240615 
379207 
427893 
377530 
429707 
427610 
100699 


SEQ NAME
RTA00002673F.i.08.2.P.Seq
RTA00002672F.e.19.2.P.Seq
RTA00002670F.b.07.2.P.Seq
RTA00002665F.k.19.1.P.Seq
RTA00002684F.g.19.2.P.Seq
RTA00002668F.c.11.1.P.Seq
RTA00002665F.i.04.1.P.Seq
RTA00002662F.b.22.2.P.Seq


ORIENTATION 


CLONE ID 
M00039080C:H06
M00038995D:E05
M00033306D:G08
M00031419D:C04
M00040305A:D11
M00032918C:B10
M00028770A:D04
M00006680B:D02
M00040017A:C06


LIBRARY 
CH09LNL 
CH09LNL 
CH09LNL 
CH08LNH 
CH09LNL 
CH08LNH 
CH08LNH 
CH02COH 
CH09LNL 


573 

574 

575 

576 

577 

578 

579 

580 

581 

582 

583 

584 

585 

586 

587 

588 

589 

590 

591 

592 

593 

594 

595 


378974 
373607 
262951 
30748 
161116 
379211 
430689 
374122 
376521 
372834 
379014 
376344 
376485 
21661 
376539 
431645
163293 
178614 
373274 
379820 
160536 
373313 
26429 


RTA00002682F.m.21.1.P.Seq
RTA00002674F.d.15.1.P.Seq
RTA00002665F.d.04.3. P.Seq 
RTA00002713F.e.11.1.P.Seq
RTA00002714F.c.11.1.P.Seq
RTA00002682F.p.20.1. P.Seq 
RTA00002669F.i.24.1.P.Seq
RTA00002673F.l.22.2.P.Seq
RTA00002677F.h.06.2.P.Seq 
RTA00002670F.b.l2.2.P.Seq 
RTA00002682F.O.02. 1 .P.Seq 
RTA00002677F.b.lS.2.P.Seq 
RTA00002676F.f.01.2.P.Seq
RTA00002709F.e. 18.1. P.Seq 
RTA00O02675F.b.IS.t.P.Seq 
RTA00002669F.h. 15.3. P.Seq 
RTA00002714F.c.20.1.P.Seq
RTA00002713F.c.20.1.P.Seq
RTA00002670F.i.22.2.P.Seq 
RTA00002679F.f. 15. 1 .P.Seq 
RTA00002663FT.10.1. P.Seq 
RTA00002671F.m.02.1. P.Seq 
RTA00002712F.k.23.1.P.Seq




M00039127D:E10

M00028215D:F03 

M00027301B:B08 

M00027837C:D09

M00040029A:G04 

M00033243B:A05 

M00039104D:C09 

M00039398A:B10

M00033308B:G05

M00040022C:D06 

M00039340B:G08 

M000392S8CB11 

M00005S20C:E04 

M000392ilA:C12 

M00033223B:H07 

M00028120D:F12 

M00027263A:F10 

M00033432B:H10 

M00039677A:B08 

M00022233C:A12

M00038328D:A03 

M00027022D:GU 

M00022979A:D05 


CH09LNL 

CH08LNH 

CH04MAL 

CH04MAL 

CH09LNL 

CH08LNH 

CH09LNL 

CH09LNL 

CH09LNL 

CH09LNL 

CH09LNL 

CH09LNL 

CH02COH 

CH09LNL 

CH08LNH 

CH04MAL 

CH04MAL 

CH09LNL 

CH09LNL 

CH03MAH 

CH09LNL 

CH04MAL 

CH03MAH 


596 
597 
598 
599 
600 
601 
602 
603 
604 
605 
606 
607 
608 
609 
610 
611 


17983 
375388 
63005 
23030 
372946 
375351 
374502 
376911 
376024 
377194 
37Q643 
379610 

25613 

207466 

400052 

21290 


RTA00002711F.f.10.1.P.Seq
RTA00002681F.j.22.2.P.Seq 
RTA00002712F.m.21.1.P.Seq
RTA00002709F.b.10.2.P.Seq
RTA00002670F.l.07.2.P.Seq
RTA00002680F.e.15.2.P.Seq
RTA00002673F.i.08.1.P.Seq
RTA00002682F.c.09.1.P.Seq
RTA00002675F.n.15.1.P.Seq
RTA00002679F.h.20.1. P.Seq 
RTA00002632F.g.0S.l.P.Seq 
RTA00002680F.k.11.1.P.Seq
RTA00002711F.g.06.1.P.Seq
RTA00002664F.j.08.1.P.Seq
RTA00002687F.h.13.1.P.Seq
RTA00002712F.g.01.1.P.Seq




M00039838B:D03
M00027094A:B03 
M00005384A:C11
M00033457D:A05 
M00039792A:B04 
M00039080C:H06 
M00039938C:A08
M00039257D:C03 
M00039635.V.A0S 
M00039978A:G03 
M00039815C:F09 
M00023024D:F12 
M00027733A:A02
M00040291D:C05 
M00026859D:D01


CH09LNL 

CH04MAL 

CH02COH 

CH09LNL 

CH09LNL 

CH09LNL 

CH09LNL 

CH09LNL 

CH09LNL 

CH09LNL 

CH09LNL 

CH03MAH 

CH04MAL 

CH14EDT 

CH04MAL 






SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY 


612 


375975 


RTA00002675F.n.13.1.P.Seq


F 


M00039258D:B08 


CH09LNL 


613


46804 


RTA00002712F.n.19.1.P.Seq


F 


M00027121D:C05


CH04MAL 


614 


69863 


RTA00002712F.i.18.1.P.Seq


F 


M00026935C:B04


CH04MAL 


615 


375285 


RTA00002676F.g.18.2.P.Seq


F 


M00039298B:B06 


CH09LNL 


616 


373000 


RTA00002670F.j.13.2.P.Seq


F 


M00033437C:C03


CH09LNL 


617 


378679 


RTA00002681F.f.16.1.P.Seq


F 


M00039869B:F06


CH09LNL 


618


45407 


RTA00002712F.k.11.1.P.Seq


F 


M00027016A:B06


CH04MAL 


619 


16838 


RTA00002712F.e.23.1.P.Seq 


F 


M00026803A:F08 


CH04MAL 


620 


186425 


RTA00002713F.c.04.1.P.Seq


F 


M00027236A:E04 


CH04MAL 


621 


376485 


RTA00002676F.e.24.2.P.Seq 


F 


M00039238C:B11


CH09LNL 


622 


41108 


RTA00002712F.n.12.1.P.Seq


F 


M00027103C:B03


CH04MAL 


623 


430876 


RTA00002669F.c.02.1.P.Seq


F 


M00033186C:D11


CH08LNH 


624 


185716 


RTA00002713F.l.07.1.P.Seq


F 


M00027557C:B01


CH04MAL 


625 


85338 


RTA00002712F.b.18.1.P.Seq


F 


M00023333D:C12


CH04MAL 


626 


185597 


RTA00002713F.m.23.1.P.Seq


F 


M00027596A:A10 


CH04MAL 


627 


139348 


RTA00002713F.k.23.1.P.Seq


F 


M00027526D:F03 


CH04MAL 


628 


454665 


RTA00002693F.d.l5.2.P.Seq 


F 


M00043164C:E12


CH19COP 


629 


186387 


RTA00002713F.l.01.1.P.Seq


F 


M00O2752SC:BI0 


CH04MAL 


630 


186387 


RTA00002713F.k.24.1.P.Seq


F 


M00027523C:B10


CH04MAL 


631 


21093 


RTA00002708F.h.20.1.P.Seq


F 


M0000430SC:C06 


CH01COH


632 


20827 


RTA00002710F.c.23.1.P.Seq


F 


M00021671D:F12


CH03MAH 


633 


21290 


RTA00002712F.f.24.1.P.Seq


F 


M00026859D:D01


CH04MAL 


634 


17646 


RTA00002710F.d.22.1.P.Seq


F 


M00021903D:G12 


CH03MAH 


635 


402817 


RTA00002686F.a.10.1.P.Seq


F 


M00039736D:G08 


CH13EDT 


636 


42354 


RTA00002713F.n.09.1.P.Seq


F 


M00027615A:F10


CH04MAL 


637 


430876 


RTA00002669F.C.02.3. P.Seq 


F 


M00033186C:D11


CH08LNH 


638 


378641 


RTA00002679F.a.21.2.P.Seq


F 


M0OO396?;C:E08 


CH09LNL 


639 


375848 


RTA00002674F.m.03.2.P.Seq 


F 


M00039168C:A04


CH09LNL 


640 


36165 


RTA00002708F.i.06.1.P.Seq


F 


M00004340C:C07 


CH01COH


641 


456506 


RTA00002694F.d.05. 1 .P.Seq 


F 


MOOO434Q2A:E01 


CH20COHLV 


642 


374450 


RTA00002672F.i.05.2.P.Seq 


F 


M000390UA:H10 


CH09LNL 


643 


378949 


RTA00002683F.o.21.2.P.Seq


F 


M00040100D:B06


CH09LNL 


644 


373313


RTA00002671F.m.02.2.P.Seq


F 


M00038328D:A03


CH09LNL 


645 


377861 


RTA00002681F.m.20.1.P.Seq


F 


M00039893A:A08


CH09LNL 


646 


431196 


RTA00002669F.f.07.2.P.Seq 


F 


M00033204B:A07


CH08LNH 


647 


372795 


RTA00002683F.a.06.1.P.Seq


F 


M00040032A:B03


CH09LNL 


648 


42340 


RTA00002661F.b.03.1.P.Seq


F 


M0000I43*?C:H06 


CH01COH


649 


374410 


RTA00002674F.k.11.1.P.Seq


F 


M0OO39I5SB:GI2 


CH09LNL 


650 


374623 


RTA00002674F.a.01.2.P.Seq


F 


M00039113D:A06


CH09LNL 


651 


431612 


RTA00002669F.e.23.2.P.Seq 


F 


M00033202D:G06


CH08LNH 


652 


240615 


RTA00002672F.e.19.1.P.Seq


F 


M00038995D:E05


CH09LNL 


653 


428508


RTA00002666F.d.01.1.P.Seq


F 


M00032545B:H09 


CH08LNH 


654 


235780 


RTA00002666F.d.03.1.P.Seq 


F 


M0003254fD:G05 


CH08LNH 


655 


17390 


RTA00002710F.e.11.1.P.Seq


F 


M00021955A:H02


CH03MAH 


656 


20100 


RTA00002710F.g.11.1.P.Seq


F 


M00022r:D:D!.2 


CH03MAH 


657 


4458 


RTA00002710F.g.13.1.P.Seq


F 


M000221S-C:C1! 


CH03MAH 


658 


373347 


RTA00002681F.h.07.2.P.Seq


F 


M00039875D:A10


CH09LNL 






SEQ 
ID 



CLUSTER 



SEQ NAME 



ORIENTATION 



CLONE ID 



LIBRARY 
CH09LNL 



659 



373477 



RTA00002672F.b.23.1.P.Seq



M00038639B:C03
M00022133C:B05 



660 



15596 



RTA00002710F.g.02.1.P.Seq



CH03MAH 
CH02COH 



661 



662 



663 



21028 



RTA00002709F.l.09.1.P.Seq



374063 



RTA00002672F.h.15.2.P.Seq



380686 



RTA00002684F.a.03.2.P.Seq 



M00007108B:A02



M00039011D:C10



M00040107B:H07 



CH09LNL 



CH09LNL 



402950 



RTA00002686F.g.ll.l.P.Seq 



M00040181B:H09 
M0003I4S5D:G02 



CH13EDT 
CH08LNH 



665 



428064 



RTA00002665F.LQ4.1. P.Seq 



23310 



RTA00002708F.e.10.1.P.Seq



M00004046C:A08 
M00039339C:F03 



CH01COH 
CH09LNL 



667 



376233 
375848 



RTA00002677F.b.15.2.P.Seq



RTA00002674F.m.03.1.P.Seq



M00039168C:A04 
M00028772C:B09



669 



670 



242251



RTA00002665F.i.08.1.P.Seq



374064 



RTA00002672F.f.15.2.P.Seq



M00038999D:C11



CH08LNH 
CH09LNL 



146260 



RTA00002663F.d.17.1.P.Seq



M00022099B:D06 



CH03MAH 
CH09LNL 



672 



673 
674 



375575 



RTA00002677F.e.22.2.P.Seq 



M00039385B:E09 



355518 



RTA00002665F.c.15.3.P.Seq



M00028201B:H12 



CH08LNH 
CH02COH 



184223 



RTA00002662F.b.08.2.P.Seq



M00005539D:G01
M00027073A:B02 



675 



213306 



429566 



RTA00002664F.e.07.2.P.Seq 



RTA00002668F.b.04.1.P.Seq



M00032907A:G04 



CH08LNH 
CH09LNL 



677 
678 



378656 



RTA00002682F.c.09.1.P.Seq



M00039927A:F04 
M00032940A:C02 



427760 



RTA00002668F.e.23.1. P.Seq 



679 



680 



681 
682 
683 



372795 



RTA00002683F.a.06.2.P.Seq



M00040032A:B03 



429340 



RTA00002666F.f. 12.1. P.Seq 



M00032577A:C04 



CH08LNH 



429822 



RTA00002668F.e.17.1.P.Seq



M00032939B:E07 
M00039783B:A06 



CH08LNH 
CH09LNL 



375224 



RTA00002680F.d.22.2.P.Seq 



RTA00002681F.h.07.1.P.Seq



M00039875D:A10
M0003998"C:C08 



684 



685 
686 



380109 



RTA00002682F.i.17.1.P.Seq



RTA00002683F.o.02.1.P.Seq



M00040097A:C12



CH09LNL 



687 



688 
689 



375348 



RTA00002676F.i.12.3.P.Seq



377889 
429883 



RTA00002672F.c.08.2.P.Seq



RTA00002667F.g.05.1. P.Seq 



M00039304D:B09 



M00038661A:A07



CH09LNL 



M00032793A:F06 



CH08LNH 



377067 
378001 



RTA00002682F.l.24.1.P.Seq



M00040014B:D01
M00039898D:C06 



690 
691 



RTA00002681F.m.22.2.P.Seq



CH09LNL 
CH03MAH 



RTA00002710F.j.21.1.P.Seq



M00022433A:E02 
M00039793D:C05 



692 
693 



RTA00002680F.f.03.1.P.Seq



377861 



RTA00002681F.m.20.2.P.Seq 



M00039893A:A08
M00032766C:A04 



694 
695 



428610 



RTA00002667F.e.09.1. P.Seq 



20765 



RTA00002710F.i.10.1.P.Seq



M00022363C:G12



27601 



RTA00002713F.e.23.1.P.Seq



M000273UC:D09 



CH04MAL 



697 
698 



RTA00002668F.o.20.2.P.Seq 



M00033140D:F06
M00033424B:A04



381024 



16454 



RTA00002670F.h.23.2.P.Seq 



RTA00002709F.f.07.1 .P.Seq 



M00006509D:B02
M00033424D:H12



CH02COH 
CH09LNL 



700 
701 



372898 



RTA00002670F.i.03.2.P.Seq



702 
703 



373681 



RTA00002671F.d.20.1.P.Seq



M00038272D:F11



RTA00002684F.h.06.2.P.Seq



M00040307B:F01



CH09LNL 
CH09LNL 



704 
705 



377343 



RTA00002684F.g.04.2.P.Seq 



374747 
185848 



RTA00002676F.e.07.2.P.Seq



M0004030:C:AQ4 
M000392S6A;C06 



CH09LNL 
CH04MAL 



RTA00002712F.m.11.1.P.Seq



M00027080A:B01






SEQ 
ID 


CLUSTER 


SEQ NAME


ORIENTATION 


CLONE ID 


LIBRARY 


706 


374311 


RTA00002676F.e.18.2.P.Seq




M000392!TC:AUO 


CH09LNL 


707 


278923 


RTAOOOOJoo / r .D. i u. i .roeq 




M00032726C:C01


CH08LNH 


708 


378667 


RTA00002681F.b.11.2.P.Seq


F


M0003984 _ A:F06 


CH09LNL 


709 


380454 


RTA00002673F.j.16.1.P.Seq




M00039084D:D07


CH09LNL 


710 


381576 


RTA00002670F.i.04.2.P.Seq




M00033425A:C10 


CH09LNL 


711 


375067 


RTA00002675F.o.03.1.P.Seq


F 


M00039260C:G03 


CH09LNL 


712 
713 
714 


89706 
10583 
379982 


RTA00002714F.a.t 1. 1. P.Seq 
RTA00002 / 1 1 r n. 1 1 • I .r.^eq 
RTA00002682F.i.16.1.P.Seq


p 

F 


MOO(P77413 F09 
M00023100A:E12
M00039987C:E12


CH04MAL 
CH03MAH 
CH09LNL 


715 
716 
717


378532 
379776 
374136 


RTA00002o80r.n.O4.j.K.3eq 
RTA00002680F.a.22.2.P.Seq
RTA00002673F.f.16.1.P.Seq




M00039828B:C05
M00039774C:A03 
M00039072C:C03 
M00022670D:H11


CH09LNL 
CH09LNL 
CH09LNL 
CH03MAH 


718 
719 


98471 
125365 


RTA00002663F.j.21.1.P.Seq
RTA00002668F.j.07.1.P.Seq




M00033019B:E10


CH08LNH 


720 


375431 


RTA00002680F.f.03.2.P.Seq 




M00039793D:C05 


CH09LNL 


721 
722 


62326 
379972 


RTA00002661F.S.20.I. P.Seq 
RTA00002679F.e.10.1.P.Seq




M00004105D:D05
M00039672D:D10 


CH01COH
CH09LNL 


723 
724 
725 
726 
727 
728 
729 
730 
731 
732 
733 


377554 
230479 

98872 

42635
379044 

96093 
403642 
400921 

93587 

79951 

176509 


RTA00002679F.f.10.1.P.Seq
RTA00002664F.c.16.2.P.Seq
RTA00002663F.j.19.1.P.Seq
RTA000026/9F.n. IS. 1 .r.beq 
RTA00002679F.a.10.2.P.Seq
RTA00002663F.j.07.1.P.Seq
RTA00002687F.d.01.2.P.Seq
RTA00002685F.b.18.2.P.Seq
RTA00002663F.k.10.1.P.Seq
RTA00002713F.c.18.1.P.Seq
RTA00002686F.b.09.1.P.Seq


F


M00039675D:B03 
M00026915B:C06 
M0002266S8:B12 
M00039684D:B08 
M00039652B:D05 
M00022640C:C12
M00039945C:F09
M00039371B:H06 
M00022731A:D02
M00027258A:A07 
M00039756B:H06


CH09LNL 
CH04MAL 
CH03MAH 
CH09LNL 
CH09LNL 
CH03MAH 
CH14EDT 
CH12EDT 
CH03MAH 
CH04MAL 
CH13EDT


734 
735 
736 
737 
738 
739 
740 
741 
742 
743 


451753 
186266 
235052 
377233 
378532 
177932 

9332 
240318 
404260 

93767 


RTA00002694F.e.06. 1 .P.Seq 
RTA00002713F.c.l6.1.P.Seq 
RTA00002692F.a. 1 5.2.P.Seq 
RTA00002682F.e.23.1. P.Seq 
RTA00002680F.n.04.2.P.Seq 
RTA00002 / 1 j r .D.i-. i r.oeq 
RTA00002712F.p. 18.1. P.Seq 
RTAOOOO-bo / r .a.lH.-.r oeq 
RTA00002687F.c.11.2.P.Seq
RTA000027 1 2F.2.09. 1 .P.Seq 


F


M00043634A:C10 
M00027256B:H09
M00042626B:D08 
M00039^40D:G08 
M00039828B:C05
M00027233B:C01
M00027179D:E06 
M00039947A:D06 
M00039942D:C01 
M00O26S68C:£U 
M00026856D:F02 


CH20COHLV 
CH04MAL 
CH18CON
CH09LNL 
CH09LNL 
CH04MAL 
CH04MAL 
CH14EDT
CH14EDT
CH04MAL 
CH04MAL 


744 
745 
746 


185642 
447544 
403274 


RTA00002712F.f.20.1.P.Seq
RTA00002689F.e.l3.;.P.Seq 
RTA00002687F.b. 10.2. P.Seq 




M00042905D:D02 
M00039766A:G07 


CH15CON 
CH14EDT


747 
748


404257 
403868 


RTA00002687F.g.06.1.P.Seq
RTA00002687F.k.05.2.P.Seq




M00040208A:C03
M000403ISC:H1 1 


CH14EDT
CH14EDT


749 
750 
751 
752 


450074 
404520 
451789 
455178 


RTA00002691F.e. 12.2. P.Seq 
RTA00002687F.f.05.2.P.Seq
RTA00002692F.b.04.2.P.Seq 
RTA00002694F.b. 19.1. P.Seq 




M000433^2D:C1 1 
M00040202A:F05 
M00042956C:B06
M00043447A:C07 


CH17COHLV 
CH14EDT
CH18CON 

CH20COHLV 






SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY 


753 


455136 


RTA00002694F.a.08. 1 .P.Seq 


F 


M00042595A:B01 


CH20COHLV 


754 


379001 


RTA00002683F.o.02.2.P.Seq 


F 


M00040097A:C12 


CH09LNL 


755 


374763 


RTA00002673F.p.21.1.P.Seq


F 


M00039118B:C05 


CH09LNL 


756 


402508 


RTA00002686F.o.15.1.P.Seq


F 


M00040281D:B01


CH13EDT


757 


431370 


RTA00002669F.m.04.3. P.Seq 


F 


M00033288B:D12 


CH08LNH 


758 


380500 


RTA00002670F.p.l9.l.P.Seq 


F 


M00033583B:E06 


CH09LNL 


759 


376743 


RTA00002678F.e.22.2.P.Seq 


F 


M00039461A:F04 


CH09LNL 


760 


191690 


RTA00002673F.m. 19.1. P.Seq 


F 


M00039107C:E04 


CH09LNL 


761 


374264 


RTA00002671F.p.21.2.P.Seq


F 


M00038620B:E09 


CH09LNL 


762 


373020 


RTA00002671 F.b.20.2.P.Seq 


F 


M00033595A:C11


CH09LNL 


763 


375231 


RTA00002671F.m.20.2.P.Seq


F 


M00038387B:A07 


CH09LNL 


764 


16130 


RTA00002709F.j.17.1.P.Seq


F 


M00006977D:A03 


CH02COH 


765 


379403 


RTA00002683F.c.17.2.P.Seq


F 


M00040041C:C09 


CH09LNL 


766 


375382 


RTA00002677F.d.24.2.P.Seq 


F 


M00039381D:C02 


CH09LNL 


767 


379653 


RTA00002683F.c.03.2.P.Seq 


F 


M00040038D:G04 


CH09LNL 


768 


377858 


RTA00002681F.e.14.2.P.Seq


F 


M00039864A:A07 


CH09LNL 


769 


430861 


RTA00002668F.h.lS.l. P.Seq 


F 


M00032995C:C05 


CH08LNH


770 


376128 


RTA00002677F.a. 1 1 .2. P.Seq 


F 


M00039334B:E03 


CH09LNL 


771 


375009 


RTA00002676F.n.20.2.P.Seq 


F 


M00039322A:F04 


CH09LNL 


772 


429816 


RTA00002667F.n.22. 1 .P.Seq 


F 


M00032871D:E11


CH08LNH


773 


375657 


RTA00002681F.h.l3.2.P.Seq 


F 


M00039877OC03 


CH09LNL 


774 


427889 


RTA00002666F.b. 14.1. P.Seq 


F 


M00032530D:C02 


CH08LNH


775 


376761 


RTA00002677F.g.03.2.P.Seq 


F 


M00039391D:F08


CH09LNL 


776 


44025 


RTA00002684F.b.24.2.P.Seq


F 


M00040115B:A04


CH09LNL 


777 


44025 


RTA00002684F.c.01.2.P.Seq


F 


M00040115B:A04


CH09LNL 


778 


392524 


RTA0000268 1 F.p.04.2.P.Seq 


F 


M00039909D:C02 


CH09LNL 


779 


427252 


RTA00002665F.b. 13. 1 .P.Seq 


F 


M00028185B:A06


CH08LNH


780 


374927 


RTA00002673F.e.12.1.P.Seq


F 


M00039068C:E06


CH09LNL 


781 


378226 


RTA00002680F.g.09. 1 .P.Seq 


F 


M00039797C:G05


CH09LNL 


782 


217964 


RTA00002664F.g.08.2.P.Seq


F 


M00027299B:B12 


CH04MAL 


783 


376368 


RTA00002677F.b.14.2.P.Seq


F 


M00039339A:H07 


CH09LNL 


784 


377719 


RTA00002677F.j.11.2.P.Seq


F 


M00039407B:G02 


CH09LNL 


785 


378081 


RTA00002677F.e.16.2.P.Seq


F 


M00039384C:E02


CH09LNL 


786 


89267


RTA00002662F.b.01.2.P.Seq


F 


M00005445D:B01 


CH02COH 


787 


374927 


RTA00002673F.e. 12.2. P.Seq 


F 


M00039068C:E06 


CH09LNL 


788 


279054 


RTA00002667F.b.23. 1. P.Seq 


F 


M00032731B:C10 


CH08LNH


789 


377283 


RTA00002682F.m. 19.1. P.Seq 


F 


M00040016C:H12


CH09LNL 


790 


45318 


RTA00002710F.l.05.1.P.Seq


F 


M00022533A:A08 


CH03MAH 


791 


188292 


RTA00002664F.e.23.2. P.Seq 


F 


M00027162B:F05 


CH04MAL 


792 


378872 


RTA00002683F.c.20.2.P.Seq


F 


M00040042B:A10 


CH09LNL 


793 


427252 


RTA00002665F.b.l3.3.P.Seq 




M00028185B:A06


CH08LNH


794 


380618 


RTA00002673F.j.12.2.P.Seq




M00039084C:G07 


CH09LNL 


795 


35646 


RTA00002667F.g.16.1.P.Seq




M00032797B:G02 


CH08LNH


796 


46407 


RTA00002665F.c.10.3.P.Seq




M00028196D:A03


CH08LNH


797 


373720 


RTA00002674F.c.04.1.P.Seq




M00039124C:F03


CH09LNL 


798 


429693 


RTA00002668F.f.05.1.P.Seq




M00032944B:B02 


CH08LNH


799 


377108 


RTA00002678F.p.04.2. P.Seq 




M00039636C:D11


CH09LNL






SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY


847 


375782 


RTA00002677F.d.23.2.P.Seq 


F 


M00039381C:H08


CH09LNL 


848 


372958 


RTA00002672F.c.02.1.P.Seq


F 


M00038639D:F07 


CH09LNL 


849 


403940 


RTA0O002683F.d.07.2.P.Seq 


F 


M0004038"D:H05 


CH14EDT


850 


8490 


RTA00002711F.g.03.1.P.Seq


F 


M0002302GCG08 


CH03MAH


851 


374809 


RTA00002675F.a.24.1.P.Seq


F 


M00039230D:D09 


CH09LNL 


852 


377788 


RTA00002684F.g.24.2.P.Seq 


F 


M0004030:C:H06 


CH09LNL 


853 


13847 


RTA00002711F.f.09.1.P.Seq


F 


M00022976C:F04 


CH03MAH


854 


374172 


RTA00002673F.k.16.1.P.Seq


F 


M00039097D:D06 



CH09LNL 


855 


380314 


RTA00002682F.l.07.1.P.Seq


F 


M00040009D:B07 


CH09LNL 


856 


47231 


RTA00002714F.b.15.1.P.Seq


F 


M00027815C:F01


CH04MAL


857 


400287 


RTA00002685F.k. 10. l.P.Seq 


F 


M00039584C:C01


CH12EDT


858 


400533 


RTA00002685F.a.02.2.P.Seq 


F 


M00039181D:E05


CH12EDT


859 


447594 


RTA00002689F.c.07.1.P.Seq


F 


M00042696B:E05


CH15CON


860 


147357 


RTA00002711F.e.15.1.P.Seq


F 


M0002292SB:C01 


CH03MAH


861 


401141 


RTA00002685F.o.22.2.P.Seq


F 


M00039642D:B12 


CH12EDT


862 


404620 


RTA00002687F.C.03. 1 .P.Seq 


F 


M00039770A:G11


CH14EDT


863 


24360 


RTA00002709F.l.20.1.P.Seq


F 


M00007119A:G02


CH02COH 


864 


380618 


RTA00002673F.j. 12. l.P.Seq 


F 


M00039084C:G07


CH09LNL 



865 


448446 


RTA00002690F.d.09.3. P.Seq 


F 


M00042797D:D10 


CH16COP 


866 


402313 


RTA00002686F.f.18.1.P.Seq


F 


M00040174D:C03


CH13EDT


867 


273151 


RTA00002685F.c.05.2.P.Seq 


F 


M00039374C:H02 


CH12EDT 


868 


404172 


RTA00002687F.d.17.2.P.Seq


F 


M00039951B:B12


CH14EDT


869 


263630 


RTA00002694F.e. 10. l.P.Seq 


F 


M00043637C:H01 


CH20COHLV


870 


404277 


RTA00002687F.d.13.1.P.Seq


F 


M00039951B:C03 


CH14EDT


871 


403557 


RTA00002687F.d. 10. l.P.Seq 


F 


M00039948A:E03 


CH14EDT


872 


375161 


RTA00002676F.m.24.2.P.Seq 


F 


M00039319B:H12 



CH09LNL 


873 


376829 


RTA00002674F.f.21.1.P.Seq


F 


M00039135D:G02 


CH09LNL


874 


372953 


RTA00002672F.c.02.2.P.Seq 


F 


M00038639D:F07


CH09LNL


875 


21578 


RTA00002709F.a.24. l.P.Seq 


F 


M00005351OG05 


CH02COH 


876 


402506 


RTA00002686F.b. 17. l.P.Seq 


F 


M00039760B:B08 


CH13EDT


877 


141731 


RTA00002713F.b.04.1.P.Seq


F 


M0002721ZD:E0j 


CH04MAL


878 


37411 


RTA00002661F.e. 11. l.P.Seq 


F 


M00003770A:E05


CH01COH


879 


372537 


RTA00002670F.c.05.2.P.Seq 


F 


M00033345D:A09


CH09LNL


880 


380834 


RTA00002670F.c.08.2.P.Seq


F 


M00033346C:A05 


CH09LNL


881 


401492 


RTA00002685F.n.17.2.P.Seq


F 


M00039609D:F07 


CH12EDT


882 


99998 


RTA00002662F.b.23.2.P.Seq 


F 


M00006712C:H09 


CH02COH


883 


404311 


RTA00002688F.d.21.2.P.Seq 




M00040394A:D04


CH14EDT


884 


231084 


RTA00002664F.c.13.2.P.Seq


F 


M00026918B:D01 


CH04MAL 


885 


447679 


RTA00002689F.b.11.3.P.Seq




M00042560A:F12 


CH15CON


886 


377012 


RTA00002682F.d. 17. l.P.Seq 


F


M00039936C:C05


CH09LNL 


887 


226207 


RTA00002664t-.d.-. 1 ..i.r.aeq 




M0002703 f D:C06 


CH04MAL 


888 


446183 


RTA00002689F.a.12.1.P.Seq




M00042534A:A05 


CH15CON


889 


428508 


RTA00002666F.c.24.1.P.Seq




M00032545B:H09 


CH08LNH 


890 
891 


157648 
404609 


RTA00002714F.D.20. l.P.Seq 
RTA00002688F.b.15.2.P.Seq




M00027818C:C07 
M00040377C:G07 


CH04MAL 
CH14EDT 


892 
893 


400464 
379108 


RTA00002685F.l.10.1.P.Seq
RTA00002685F.l.12.1.P.Seq




M00039590D:D02
M00039591C:D06


CH12EDT 
CH12EDT






SEQ 
ID 


CLUSTER 


SEQ NAME


ORIENTATION
F 


CLONE ID 
M000392S-iU>:bl-: 


LIBRARY 
CH09LNL 


894 
895 


374639 
380674 


RTA00002676F.d.21.2.P.Seq
o*r a finnfl^fiT^F i 14 "> P Sea 


F 


M00039084C:H04 


CH09LNL 


896 


380674 


RTA00002673F.j.U. l.P.Seq 


F 


M00039084C:H04 


CH09LNL 


897 


188972 


RTA00002664F.d.20.2.P.Seq 


F 


M00027030C:H06 


CH04MAL 


898 


402835 


RTA00002686F.c.01.1.P.Seq


F 


M00040131D:G08 


CH13EDT


899 


403774 


RTA00002687F.d.08.2.P.Seq 


F 
F 


M00039947C:G03 
M00039096A:A05 


CH14EDT
CH09LNL 


900 
901 


374606 
192535 


RTA00002673F.j.23.2.P.Seq 
RTA00002663F.m.14.1.P.Seq


F 
F 


M00022925C:A08
M00039820B:B06 


CH03MAH

pttAQI VII 


902 
903 


377926 
186055 


RTA00002680F.U6.2.P.Seq 
RTA00002712F.i.11.1.P.Seq


F 


M00026926A:E10 


CH04MAL


904 


380498 


RTA00002684F.f.U.2.P.Seq 


F 


M00040129D:E10


CH09LNL


905 


400236 


RTA00002685F.U8.2.P.Seq 


F 


M00039561A:B07


CH12EDT


906 


401070 


RTA00002688F.d.12.2.P.Seq


F 


M00040390A:H02


CH14EDT


907 


452622 


RTA00002692F.b.14.2.P.Seq


F 


M00042962D:C05 


CH18CON


908 


235052 


RTA00002692F.a.15.1.P.Seq


F 


M00042626B:D08 


CH18CON


909 


452221 




F 


M00042986C:G12 


CH18CON 


910 


404581 


RTA00002687F.g.11.2.P.Seq


F 


M00040208D:G09 


CH14EDT 


911 


376925 


RTA00002687F.e.14.2.P.Seq


F 


M00039957C:C09 


CH14EDT


912 


400287 


RTA00002685F.k.10.2.P.Seq


F 


M00039584C:C01


CH12EDT 


913 


403242 


RTA00002687F.l.05.2.P.Seq


F 


M00040323B:C12


CH14EDT


914 
915 


453313 
452633 


RTA00002693F.a.07.2.P.Seq 
RTA00002692F.f.11.2.P.Seq


F 
F 
F 


M00042614B:B05
M00043067D:D10
M00042560A:F12


CH19COP 
CH18CON
CH15CON 


916 
917 
918 
919 


447679 
452398 
449797 
403916 


RTA000026S9F.b.l U.P.Seq 
n-r * nnnm/iQTP f i *7 i p Sea 

RTA00002691F.b.22.3.P.Seq
RTA00002687F.j.11.2.P.Seq


F 
F 
F 


M00043125C:A11
M0004333-tB:A10 
M00040314D:H05 


CHI SCON 
CH17COHLV 
CH14EDT 


920 
921 
922 
923 


236906 
404161 
386110 
451512 


RTA00002693F.d.05.2.P.Seq 
R I AUUUU.Oo i r .c._u._.r.oc^ 
RTA00002687F.e.06.:.PSeq 
RTA00002691F.b.02.3.P.Seq 


F 
F 
F 
F 
F 


M00043154A:B07 
M00039958C:B09 
M00039955CC04 
M00043305B:G02
M00040320D:F02 


CH19COP 
CH14EDT 
CH14EDT 
CH17COHLV

CH14EDT 


924 
925 
926 
927 


400517 
403578 
403578 
403371 


RTA00002687F.k.15.2.P.Seq
RTA00002687F.i.01.2.P.Seq
RTA00002687F.h.24.2.P.Seq
RTA00002687F.h.19.2.P.Seq


F 
F 
F 
F 


M00040296D:E09
M00040296D:E09 
M00040294D:D12 
M00043125A:B11


CH14EDT
CH14EDT 
CH14EDT

L. n 1 ov.VJi > ' 


928 
929 


452531 
454453 


RTA00002692F.f.16.1.P.Seq
RTA00002693F.f.15.2.P.Seq


F 


M00043215A:D02 


CH19COP 


930 
931 


238270 
14583 


RTA00002692F.e.07.2.P.Seq 
RTA00002687F.f.08.2.P.Seq


F 
F 


M0004302SA:G05 
M00040203B:A05 


CH18CON

CH14EDT


932 
933 
934 


400464 
404642 
380413 


RTA00002685F.l.10.2.P.Seq
RTA00002687F.f.02.2.P.Seq 
RTAOOUO-oiSUr.K. I v. i .r.oeq 


F 
F 
F 


M00039590D:D02 
M00040201C:G11
M00039816C:D05


CH12EDT 
CH14EDT
CH09LNL 


935 
936 
937 


287963 
20847 
456531 


RTA00002693F.c.20.2.P.Seq 
RTA00002710F.d.09.1.P.Seq
RTA00002694F.b.18.1.P.Seq


F 
F 
F 


M00043148C:A09
M00021852D:A05
M00043446C:E12


CH19COP 
CH03MAH 
CH20COHLV 


938 
939 
940 


4:0463 
456713 
455508 


RTA00002694F.a.12.1.P.Seq
RTA00002694F.d.13.1.P.Seq
RTA00002694F.a.15.1.P.Seq


F 
F 
F 


M00042596C:D07
M00043513D:G08 
M000J25»37B:E12 


CH20COHLV 
CH20COHLV 
CH20COHLV 






SEQ
ID


CLUSTER


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY 


941 


j7ot jo 


pTAnfiftO n n74F m 0v2.P.Seq 


F


M00039169A:E12 


CH09LNL 


942 


402oj 1 


pTAnnftO"'686F m 03J.P.Seq 


F 


M00040264D:G05 


CH13EDT 


943 


373820 


DTAnnnn*>rt74F d 06 1 P Sea 


F 


M00039127A:G11 


CH09LNL 


944 


85388 


pTAnnnrPn74F c 06 7 P Sea 


F 


M00039124C:H08


CH09LNL 


945


4007 j2 


pTAnnnfPn85F k ? 4 2. P. Sea 


F 


M0005958"C:F12 


CH12EDT 


946 


431629


pTAnnnn^fi^QF 1 14 I PSea 


F 


M00033276B:G08


CH08LNH 


947 


449349 


pTAnnnrnrtQOF d P 3 P Sea 


F 


M00042802C:C04


CH16COP 


948 


401124


oTAnnnn^ftRSF o 1 1 "> P Sea 


F 


M00039629D:B04 


CH12EDT 


949 


.1 C "» H " 


pTAnnno^69jF a 01 "'.PSea 


F 


M00042611A:A06


CH19COP 


950 


I24is I j 


PTAnnno'JfiSSF i 10. 2. P. Sea 


F 


M00039564B:C01


CH12EDT 


951


4j462/ 


DTAnnno^fiQiF f 09 "* P Sea 


F 


M00043210C:E05


CH19COP 


952 


169464


DTAnnnn^fi63F i 19 1 P Sea 


F 


M00022602A:E09 


CH03MAH 


953 


451654 


DTAnnno^fiQ^F f 0"* ** P Sea 


F 


M00043044D:A09 


CH18CON


954 


406092 


DTAnfififPfi85F k 1 1 ^ P. Sea 


F 


M00039584C:C11


CH12EDT 


955


453501 


DTAnnnn*'^Q"iFd 14 1 PSea 


F 


M00043162D:C12 


CH19COP 


956 


450845 


RTA00002691F.f.10.1.P.Seq


F 


M00043410C:A09


CH17COHLV


957 


448177


dt a nnnfp AQOF <* P 1 PSea 


F 


M00042839B:B11


CH16COP


958 


402617 


RTA00002686F.b.21.1.P.Seq


F 


M00040131B:D11


CH13EDT 


959 


378014


RTA00002680F.g.17.1.P.Seq


F 


M00039799A:D10 


CH09LNL


960 


124813 


RTA00002685F.j.10.1.P.Seq


F 


M00039564B:C01 


CH12EDT 


961 


29450 


RTA00002663F.d.07.1.P.Seq


F 


M00022054A:H03 




962 


400486 




F 


M00039496B:D08


CH12EDT


963 


44753 


dt a nnnfP7 1 f 0^ I P Sea 


F 


M00027324D:C05 


CH04MAL 


964 


448177


dta nnnn^^oOF •» P P Sea 


F 


M00042839B:B11


CH16COP 


965 


447697 


dt \ rifififPASQF ? 1 S 3 P Sea 
dt \ nnnm rl 04 1 PSea 


F 
F 


M00042905A:F11
M00039947A:D06 


CHI SCON 
CH14EDT 


966 
967 


240318 
451620


R I AUUUU-Oo / r .u.u**. i .r.jcL 
DTAnnnfPAQl F d "'O 3 P Sea 
o*r vrinnmAQ-NP i "*0 1 P Sen 


F 
F 


M00043379D:H02 
M00039561B:A09


CH17COHLV 
CH12EDT


968
969 


400157 
400276 


R I AUU'JU-Oo} r .1 — u.-.r. jc^ 

pTAnnnrPrtSSF h 16 P Sea 


F 


M0003952SB:312 


CH12EDT


970 


449779 


pTAnnf»0"*69i F d 04.3. P. Sea 


F 


M00043367B:A08 


CH17COHLV 


971 


40U 1 3 / 


RTAfiO00"'685F i 20.1.P.Seq 


F 


M00039561B:A09 


CH12EDT 


972 


238133 


dta nnfifP68^F e 03 1 P Sea 


F 


M00039496B:H09


CH12EDT 


973 


452015 


pTAnnofpfitPF c 07 2.P Sea 


F 


M000429SIB:DU 


CH18CON 


974 


4007j2 


RT a 0000^68^ F 1 01.2 P Sea 


F 


M00039587C:F12


CH12EDT 


975 


24984 


pTAnnnn''7i IF d 1 LP. Sea 


F 


M00022910A:A06 


CH03MAH 


976 


449040 


RTA00002690F.e.14.2.P.Seq


F 


M00042841D:H07


CH16COP 


977 


377481


pt AOOno"*67 1 F i 1 5 3 P Sea 


F 


M00038303A:C03 


CH09LNL 


978 


400910 


RTA00002685F.b.07.1.P.Seq


F 


M00039367B:H02 


CH12EDT 


979 


376945 


pTAnnnn^nS^F k "*3 1 PSea 


F 


M00040007D:A06 


CH09LNL


980 


15906 


RTA00002709F.e.14.1.P.Seq


F 


M00005805D:D12


CH02COH 


981 


452781 


RTA00002692F.b.16.2.P.Seq


F 


M00042966B:F07 


CH18CON


982 


415294 


RTA00002686F.f.14.1.P.Seq


F 


M00040173D:B05 


CH13EDT 


983 
984 


401644 
404402 


RTA00002685F.n.16.1.P.Seq
RTA00002687F.a.19.2.P.Seq


F 
F 


M00039608D:H01 
M00039761D:E10


CH12EDT 
CH14EDT 


985 
986 


401709
401644 


RTA00002685F.n.24.2.P.Seq
RTA00002685F.n.16.2.P.Seq


F 
F 


M00039624A:H09 
M00039608D:H01


CH12EDT 
CH12EDT 

CH18CON


987 


452531 


RTA00002692F.f.16.2.P.Seq


F 


M00043125A:B11








SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION


CLONE ID


LIBRARY 


988 


400910 


RTA00002685F.b.07.2.P.Seq 


p 


M00039367B:H02


CH12EDT 


989 


449235 


RTA00002690F.a.22.3.P.Seq 


p 


M00042439B:B03 


CH16COP 


990 


449794 


RTA00002691F.c.22.2.P.Seq


p 


M00043361B:A01


CH17COHLV


991 


400921


RTA00002685F.b.18.1.P.Seq


p 


M00039371B:H06


CH12EDT


992 


373874 


RTA00002672F.c.22.2.P.Seq 


p 


M00038663D:H10


CH09LNL 


993 


401050 


RTA00002685F.e.09.2.P.Seq 


p 


M00039499C:A04 


CH12EDT 


994 


453237 


RTA00002693F.c.02.2.P.Seq 


p 


M00043108A:F06


CH19COP 


995 


449294 


RTA00002690F.c.13.3.P.Seq


p 


M00042770C:C04


CH16COP 


996 


404260 


RTA00002687F.C. I U .P.Seq 


p 


M00039942D:C01 


CH14EDT 


997 


378014 


RTA00002680F.g.17.2.P.Seq


p 


M00039799A:D10


CH09LNL 


998 


404726 


RTA00002688F.a.18.2.P.Seq


p 


M00040371C:H05 


CH14EDT 


999 


451347 


RTA00002691F.b.11.3.P.Seq


p 


M00043311C:E03


CH17COHLV 


1000 


401154 


RTA00002685F.e.06.2.P.Seq 


p 


M00039497C:C06


CH12EDT 


1001 


401870 


RTA00002686F.b.22.1.P.Seq


p 


M00040131C:F03


CH13EDT 


1002 


400170 


RTA00002685F.b.03.2.P.Seq 


p 


M00039366C:B07 


CH12EDT 


1003 


25387 


RTA00002711F.f.19.1.P.Seq


p 


M00023001C:C08


CH03MAH 


1004 


377085 


RTA00002678F.n.14.1.P.Seq


p 


M00039619B:D02


CH09LNL 


1005 


403530 


RTA00002683F.a.09.2.P.Seq 


p 


M0004036SA:F01 


CH14EDT


1006 


372930 


RTA00002670F.j.12.2.P.Seq


p 


M0003343 T C:A07 


CH09LNL 


1007 


401120 


RTA00002685F.c.23.2.P.Seq 


p 


M00039379A:B03


CH12EDT 


1008 


403397 


RTA00002687F.h.02.2.P.Seq 


p 


M00040219B:D02


CH14EDT


1009 


449337 


RTA00002690F.c.18.3.P.Seq


p 


M00042774C:C03 


CH16COP 


1010 


403561 


RTA00002688F.d.06.2.P.Seq 


p 


M000403S~C:E07 


CH14EDT


1011


134132 


RTA00002692F.d.13.2.P.Seq


F 


M00043011A:H12


CH18CON


1012 


377085 


RTA00002678F.n.l4.2.P.Seq 


p 


M00039619B:D02 


CH09LNL 


1013 


376138 


RTA00002674F.m.05.1.P.Seq


p 


M00039169A:E12


CH09LNL 


1014 


401154 


RTA00002685F.e.06.1.P.Seq


p 


M00039497C:C06


CH12EDT 


1015 


449825 


RTA00002691F.b.14.3.P.Seq


F 


M00043320B:A07 


CH17COHLV 


1016 


403896 


RTA00002687F.a.04.2.P.Seq


p 


M00039746C:H05


CH14EDT


1017 


377632 


RTA00002683F.US.2.P.Seq 


F 


M000400S _ D:F08 


CH09LNL 


1018


450845 


RTA00002691F.f.10.2.P.Seq


p 


M00043410C:A09


CH17COHLV 


1019 


450045 


RTA00002691F.e.10.2.P.Seq


p 


M00043391A:C10


CH17COHLV 


1020 


402962 


RTA00002686F.d.22.1.P.Seq


p 


M00040147D:H11


CH13EDT 


1021 


427674 


RTA00002665F.U 0.1. P.Seq 


p 


M00028775D:F03 


CH08LNH


1022 


403252 


RTA00002688F.c.15.2.P.Seq


p 


M00040383D:C04 


CH14EDT


1023 


452038 


RTA00002692F.a.09.1.P.Seq


p 


M00042623D:D07 


CH18CON


1024 


401553 


RTA00002685F.d.08.2.P.Seq


p 


M00039482B:G02


CH12EDT 


1025 


451092 


RTA00002691F.d.17.3.P.Seq


p 


M0004337~A:C03 


CH17COHLV 


1026 


403978 


RTA00002687F.g.09.2.P.Seq


F 


M00040208B:A07


CH14EDT


1027 


377186 


RTA00002682F.m.07.1.P.Seq




M00040014D:F03


CH09LNL 


1028 


404679 


RTA00002687F.f.07.2.P.Seq 




M00040203A:H06 


CH14EDT


1029 


373875 


RTA00002674F.c.05.1.P.Seq




M00039124C:H02


CH09LNL 


1030 


128841 


RTA00002685F.o.15.2.P.Seq




M00039630C:H04 


CH12EDT 


1031 


33971 


RTA00002713F.h.13.1.P.Seq




M000273 l ):3:H02 


CH04MAL 


1032 


332878 


RTA00002666F.h.13.1.P.Seq




M000325"3"C:B0I 


CH08LNH


1033 


400781 


RTA00002685F.j.03.2.P.Seq




M000395o:3:C02 


CH12EDT 


1034 


456456 


RTA00002694F.b.22.1.P.Seq




M00043440A:E12


CH20COHLV



SEQ
ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY 


1035 


402337 


RTA00002686F.l.07.1.P.Seq


F 


M00040257D:H10 


CH13EDT 


1036 


401974 


RTA000026S6F.U 5,1. P.Seq 


F 


M00040223A:C05 


CH13EDT 


1037 


455141 


RTA00002694F.b.14.1.P.Seq


F 


M00043440C:B07


CH20COHLV 


1038 


402057 


RTA00002686F.l.14.1.P.Seq


F 


M00040260C:D04 


CH13EDT 


1039 


402555 


RTA00002686F.m.14.1.P.Seq


F 


M00040267C:C04


CH13EDT 


1040 


406092 


RTA00002685F.k.11.1.P.Seq


F 


M00039584C:C11


CH12EDT 


1041 


374351 


RTA00002674F.L20.1. P.Seq 


F 


M00039147A:F10


CH09LNL 


1042 


402365 


RTA00002686F.j.08.1.P.Seq


F 


M00040230A:H02 


CH13EDT


1043 


401823 


RTA00002686F.j.14.1.P.Seq


F 


M00040232D:B07 


CH13EDT 


1044 


447669 


RTA00002689F.a.15.2.P.Seq


F 


M000425_'iB:E06 


CH15CON


1045 


402588 


RTA00002686F.k.13.1.P.Seq


F 


M00040254B:C10


CH13EDT


1046 


244858 


RTA00002686F.l.02.1.P.Seq


F 


M00040256A:A06 


CH13EDT


1047 


402339 


RTA00002686F.L20.1. P.Seq 


F 


M00040226A:H10 


CH13EDT


1048


401766 


RTA00002686F.o.16.1.P.Seq


F 


M00040282A:A03 


CH13EDT


1049


402952 


RTA00002686F.g.14.1.P.Seq


F 


M00040181D:H10 


CH13EDT


1050 


449669 


RTA00002690F.c.10.3.P.Seq


F 


M00042767B:G10 


CH16COP


1051 


400520 


RTA00002685F.g.04.2.P.Seq 


F 


M00039512C:D06


CH12EDT


1052 


403863 


RTA00002687F.k.05.1.P.Seq


F 


M00040318C:H11


CH14EDT


1053 


403242 


RTA00002687F.l.05.1.P.Seq


F 


M00040323B:C12 


CH14EDT


1054 


402182 


RTA00002686F.f.16.1.P.Seq


F 


M00040174C:E10 


CH13EDT


1055 


449269 


RTA00002690F.c.12.3.P.Seq


F 


M00042770B:B12 


CH16COP 


1056 


401290 


RTA00002685F.n.10.1.P.Seq


F 


M00039606B:D03 


CH12EDT 


1057 


443420 


RTA00002690F.d.07.3.P.Seq


F 


M00042790C:C07


CH16COP


1058 


374351 


RTA00002674F.L20.2. P.Seq 


F 


M00039147A:F10


CH09LNL 


1059 


448464 


RTA00002690F.c.08.3.P.Seq 


F 


M00042765C:D04 


CH16COP 


1060 


401079 


RTA00002685F.p.05.2.P.Seq 


F 


M00039643C:B04


CH12EDT


1061 


403916 


RTA00002687F.j.11.1.P.Seq


F 


M00040314D:H05 


CH14EDT


1062 


401374 


RTA00002685F.p.07.2.P.Seq


F 


M00039645C:E01 


CH12EDT


1063 


400503 


RTA00002685F.k.02.1.P.Seq


F 


M00039570B:D10 


CH12EDT


1064 


219825 


RTA00002664F.h.06.2.P.Seq 


F 


M000273%D:G08 


CH04MAL


1065 


377732 


RTA00002681F.p.09.2.P.Seq


F 


M00039910C:G10 


CH09LNL


1066 


380348 


RTA00002684F.d.12.1.P.Seq


F 


M00040121 3:C05 


CH09LNL


1067 


449549 


RTA00002690F.a.09.3.P.Seq 


F 


M00042431C:F01


CH16COP


1068 


402223 


RTA00002686F.f.05.1.P.Seq


F 


M00040169B:F08




1069 


401727 


RTA00002685F.o.23.2.P.Seq 


F 


M00039642D:H09 


CH12EDT


1070 


379878 


RTA00002682F.h.12.1.P.Seq


F 


M000399S4A:C02 


CH09LNL


1071 


378602 


RTA00002681F.a.08.2.P.Seq


F 


M00039839C:E05 


CH09LNL


1072 


448065 


RTA00002690F.c.22.3.P.Seq


F 


M0004273 1 A:A07 


CH16COP


1073 


403493 


RTA00002687F.j.03.1.P.Seq




M00040313D:E04


CH14EDT


1074 


400517 


R I AUUlU-OiS / r .K. i - - 1 r.ieq 


F 


M00040320D:F02 


CH14EDT


1075 


456636 


RTA00002694F.e.05.1.P.Seq




M00043632D:F09 


CH20COHLV 


1076 


400101


RTA00002685F.o.04.1.P.Seq




M00039625B:G03 


CH12EDT


1077 


403578 


RTA00002687F.i.01.1.P.Seq




M00040296D:E09


CH14EDT


1078 


402419 


RTA00002686F.g.20.1.P.Seq




M00040IS-iC:Al ! 


CH13EDT 


1079 


375161 


RTAOOC02676F.n.01.:. P.Seq 




M000393tPB:H12 


CH09LNL 


1080 


401851 


RTA00e02636F.d.0T.l. P.Seq 




M00040143.V.H05 


CH13EDT 


1081 


400567 


RTA00002685F.a.14.2.P.Seq




M00039361B:E01 


CH12EDT






SEQ
ID


CLUSTER


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY 


1082


J I 004 I 


aTAonnn°677F d 01 P Sea 




M00039345A:D09


CH09LNL 


1083


J /004 i 


RTAOOO0">677F c n 4 ^ P Seq 


F 


M00039345A:D09


CH09LNL 


1084


JAAti^n. 


RTAOOOO' 1 685F i.22.l P.Seq 


p 


M00039570A:D10


CH12EDT 


1085


J / JJ / J 


RTAnn00'»676F h P 1 PSeq 


P 


M00039300C:C09 


CH09LNL 


1086


17^171 
J / JJ / J 


RTA0000" , 676F h 12 2 P.Seq 


F 


M00039300C:C09 


CH09LNL 


1087


A 1 16A1 
*+ I J 0*0 


RTA000C685F n 05.2. P Seq 


" " F 


M00039604D:G03 


CH12EDT 


1088


44 OO /4 


RTAnnnn">6Q0F c O'' 3 P Sea 


■ - p 


M00042759B:GU 


CH16COP 


1089


J /OD 1 1 


RTAfinnO'*fi74F h 04 I P Sea 


F 

... 


M00039140A:B08


CH09LNL 


1090


17 IAjIA 


RTAonno">fi74F h *> 1 1 P Sea 




M00039142D:B11


CH09LNL 


1091


/t <A 1 17 

4M I 


rt Annfin769jF e IS 1 P Sea 


F 


M00043191A:A07


CH19COP 


1092


,1 A/1 < 0 1 

4U4j5 1 


RTAnnnn^fi^F » 1 1 1 P Sea 


F 


M0004020SD:G09 


CH14EDT 




7AAO 1 


RTAnn0fl''fi89F c 13 1 P Sea 


F 


M0004270:3:G02 


CH15CON


1094


J /Vj04 


RTAfioon" , 687F o P 1 P Sea 




M00040346A:C11


CH14EDT 




4J.4V 1 


RTAO000''69* , F f 0^ 2 P Seq 


- -'p 1 


M00043046D:B11


CH18CON


1096


4Uj J4 1 


RTAonoo' , fi87F d ^0 ^ P Sea 


- "p 


M0004036-A:E05 


CH14EDT 


1097


A A/1 At A 


RTAP,nf)fPn88F b 1 1 ^ P Sea 


F 


M00040376C:G02 


CH14EDT 


1098


j /Vjo4 


RTAOfton" , o87F o P "* P Sea 


■ p 


M0004034cA:CI 1 


CH14EDT 


1099


4j I D45 


RTAfifinrPnQ l F b 0 Q 3 P Sea 


F - 


M0004331CC:G06 


CH17COHLV 


1100


454308 


RTAnnnn"*ftO^F t* 14 ! P Sea 




M000432 13 3:812 


CH19COP


1101


401184


DTAAnnn7A9sP A 04 1 P Sea 


= 


M0003938CC:C09 


CH12EDT 


1102


401290


bta Annn^AssF n 10 "* P Sea 

K 1 rtUUUU^Oo Jr .n. ( vj-— .r. 


F 




M000396063:D08 


CH12EDT


1103


A A A 1 A 1 

400 1 0 1 


DTAnnnn7A3^F o 04 "* P Sea 




M0003962: 3:G0S 


CH12EDT 


1104


454j0o 


pTAOonn^ftO^F f 14 "* P Sea 




M00043213 3:B12 


CH19COP 


1105


A CT£T1 

45ioJJ 


RTAOonn^AQ^F h 14 1 P Sea 


f 


M0004296ZD:C05 


CHI SCON 


1106


A <AA 1 7 


rtaoooo^AQI F d 0 ft 3 P Sea 


F 


M000433703:C08 


CH17COHLV


1107


40U5UJ 


RTAnoon^n8"SF k 0* 1 1 P Sea 


F — 


M000395703:D10 


CH12EDT 


1108


JAAJCA 


rtaoooO' ) 68'>F i *> p Sea 


F 


M00039570A:D10


CH12EDT 


1109


440 I 00 


RTAOOOO^fiSQF c l~ I P Seq 


" " p 


M000427i;3:All 


CH15CON 


1110


4jo_ J J 


rta 0000^694 F e OS 1 P Sea 


" p 


M0004363c3:C06 


CH20COHLV 


1111


7^ Ml 
ij44j 


RTAonoo" , 7lOF d 1* 1 P Sea 


p- 1 


M0002186cD:A03 


CH03MAH 


1112


A(\A 1 1 Q 

4U4 1 IV 


RTAnnoo^fiSSF d P 1 P Sea 




M00040392C:B12 


CH14EDT


1113


4Uj04i 


RTAnnnn* , *i < iS7F d 0 1 IP Sea 


"p" 


M0003994:'C:F0° 


CH14EDT 


1114


A (\1AQ1 

4UJ4VJ 


rtaoooo" ) 6S7F i 0" "* P Sea 


p 


M00040313D:E04 


CH14EDT


1115


4< A 1 17 
4j4 I JI 


RTA00n0^693F e 18 1 P Seq 


- . F " 


M00043191A:A07


CH19COP 


1116


,1 CAAA7 


pTAnonn"*^9 1 F d 1 ? 3 P Sea 


F 


M00043372C:G05 


CH17COHLV 


1117


j< 1 7 1 J? 


RTA0000" , 69" , F e ^4 2 P.Seq 


F 


M00043044 3:A12 


CH18CON 


1118


d^QA.7 
*4 J JVU / 


RTAOOOO^e^'F b. OS. 2. P. Seq 


p 


M0004308"3:G07 


CH19COP 


1119


■tLd^AAQ 


RT A0000 n 689F a. 1 ^.3. P.Seq 


p 


M0004253S3:E06 


CH15CON 


1120


A(\A()AA 


RTA00002687F.p.11.1.P.Seq


p 


M0004035iD:Al! 


CH14EDT 


1121 


449617 


RTA00002690F.e.16.2.P.Seq


F 


M00042849D:F11


CH16COP


1122 


452723 


RTA00002692F.e.18.2.P.Seq




M00043036C:E05 


CH18CON


1123 


270014 


RTA00002685F.i.15.2.P.Seq




M00039536C:H11


CH12EDT 


1124 


401198


RTA00002685F.i.14.2.P.Seq




M00039536C:C10


CH12EDT 


1125 


452414 


RTA00002692F.e.12.1.P.Seq




M00043032C:A10


CH18CON


1126 


453019 


RTA00002692F.d. 1 S. 2. P.Seq 




M00043015A:H10 


CH18CON


1127 


403642 


RTA00002687F.c.24.1.P.Seq




M00039945C:F09 


CH14EDT 


1128 


401437 


RTA00002685F.c.18.2.P.Seq


F 


M00039377D:E12


CH12EDT






SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY 


1129


452414 


RTA00002692F.e.12.2.P.Seq




M00043032C:A10


CH18CON 


1130


404122 


RTA00002687F.n.10.1.P.Seq


p 


M00040334D:B02 


CH14EDT 


1131


400567 


RTA00002685F.a.14.1.P.Seq




M00039361B:E01


CH12EDT 


1132


401417 


RTA00002685F.c.18.1.P.Seq


F 


M00039377D:E12 


CH12EDT


1133


404647 


RTA00002687F.f.02.1.P.Seq


F 


M00040201C:G11


CH14EDT 


1134


176007 


RTA00002676F.f.22.2.P.Seq 


F 


M00039293B:C11 


CH09LNL 


1135


407A1S 


RTA00002686F.b.24.1.P.Seq


F 


M00040131D:G08


CH13EDT 


1136


401774 


RTA00002687F.d.08.1.P.Seq


F 


M00039947C:G03


CH14EDT 


1137


45505 


RTA00002712F.d.04.1.P.Seq


F 


M00023377B:F01 


CH04MAL


1138


452071 


RTA00002692F.c.05.2.P.Seq 


F 


M00042979B:E02 


CH18CON


1139


44933*> 


RTA00002691F.e.13.1.P.Seq


F 


M00043393A:B08 


CH17COHLV 


1140


379004 


RTA00002683F.n.09.2.P.Seq


F 


M00040093B:C02 


CH09LNL 


1141


455211


RTA00002694F.D.07. 1 .P.Seq 


F 


M00043430B:C02 


CH20COHLV 


1142


379021


RTA00002683F.n.13.2.P.Seq


F 


M00040093D:D03 


CH09LNL 


1143


176*>79 

J / U» / 7 


RTA00002680F.d.10.2.P.Seq


F 


M00039785D:G05 


CH09LNL 


1144


174171 

J /HJ / J 


RTA00002681F.n.21.1.P.Seq


F 


M00039903A:H07 


CH09LNL 


1145


Q766X 

7 / DUO 


RTA00002686F.d.19.1.P.Seq


p 


M00040145D:D03 


CH13EDT 


1146


J.0OJ07 


RTA00002685F.a.05.2.P.Seq


F 


M00039tS4A:D03 


CH12EDT 


1147


407Q04 


RTA00002686F.n.15.1.P.Seq


F 


M00040274A:H11


CH13EDT 


1148


401QI 7 


RTA00002687F.j.19.1.P.Seq


F 


M00040317A:H03


CH14EDT 


1149


400 s 1 1 


RTA00002685F.b.23.2.P.Seq


F 


M00039372C:D12 


CH12EDT 


1150


407746 


RTA00002686F.a.14.1.P.Seq


p 


M00039740B:F10 


CH13EDT 


1151


401X49 


RTA00002687F.n.09.2.P.Seq


F 


M00040333D:G05 


CH14EDT 


1152


401 47 1 


RTA00002685F.o.10.1.P.Seq


p 


M00039629B:F01


CH12EDT 


1153


404362 


RTA00002687F.o.06.2.P.Seq


F 


M00040342B:D12 


CH14EDT 


1154


171641 

J / JU*+ 1 


RTA00002677F.i.09.2.P.Seq 


F 


M00039403A:G12 


CH09LNL 


1155


HI/ l 7Ji 


RTA0000^686F i. 10. 1. P.Seq 


F 


M00040231B:C08 


CH13EDT 


1156


4006 SS 


RTA00002685F.m.09.2.P.Seq


F 


M0003P5^ T D:F04 


CH12EDT 


1157


407 ££Q 


RTA00002686F.n.05.1.P.Seq


p 


M00040271B:E12 


CH13EDT 


1158


18046"* 


RTA00002670F.o.01.2.P.Seq


F 


M00033570B:E06 


CH09LNL 


1159


400078 

HUUV * O 


RTA00002685F.m.15.2.P.Seq


F 


M00039600A:A11


CH12EDT 


1160


171748 

J f J i *+o 


RTA0000267IF.1.06.3.P.Seq 


p 


M00038325D:F12


CH09LNL 


1161


401392 


RTA00002685F.f.08.2.P.Seq


p 


M00039505C:E03


CH12EDT 


1162


20548 


RTA00002710F.h.15.1.P.Seq


F 


M00022247A:E02


CH03MAH 


1163


376279 


RTA00002680F.d.10.1.P.Seq


F 


M00039785D:G05


CH09LNL 


1164


3744°8 


RTA00002672F.a.20.1.P.Seq


F 


M00038633B:G02


CH09LNL 


1165


57440c 


RTA00002672F.a.20.2.P.Seq


F 


M00038633B:G02


CH09LNL 


1166


177Q14 

j / — 7 1 *t 


RTA00002679F.j.21.1.P.Seq


p 


M0003969dA:E05 


CH09LNL 


1167


178170 

J / O J — u 


RTA0000268 1 F.L 14.2. P.Seq 


F 


M00039894C:H07 


CH09LNL 


1168


235422 


RTA00002665F.h.19.1.P.Seq


F 


M0002S76SGD05 


CH08LNH


1169 


402473 


RTA00002686F.p.11.1.P.Seq




M000402S"C:B09 


CH13EDT


1170 


374828 


RTA00002674F.m.10.1.P.Seq




M0003<M70A:BI0 


CH09LNL 


1171 


403912 


RTA00002687F.j.19.2.P.Seq




M00040317A:H03


CH14EDT 


1172 


401471 


RTA00002685F.o.10.2.P.Seq




M00039629B:F01


CH12EDT 


1173 


404362 


RTA00002687F.o.06.1.P.Seq




M00040342B:D12 


CH14EDT 


1174 


403849 


RTA00002687F.n.09.1.P.Seq




M00040333D:G05 


CH14EDT 


1175 


395617 


RTA00002687F.b.15.1.P.Seq




M0003976"B:A04 


CH14EDT 








CLUSTER 


SEQ NAME


ORIENTATION 
F 


CLONE ID 
M00039624A:H09 


LIBRARY 
CH12EDT 


1176
1177


JO 1 700 

AftAAAA 


RTA00002635F.o.0L2.P.Seq 
RTA00002687F.O.22. LP.Seq 


F 


M00040347D:F09 


CH14EDT 


1178


447795 


RTA00002689F.e.06.3.P.Seq


F 


M00042S95CG01 


CH15CON 


1179


lOl J7 


RTA00002708F.f.10.1.P.Seq


F 


M00004139B:B!0 


CH01COH 


1180




RTA00002687F.a.05.1.P.Seq


F 


M00039746C:H06


CH14EDT 


1181


■tJ J J I — 


RTA00002693F.a.21.2.P.Seq


F 


M00043078D:D04 


CH19COP


1 1 ft"> 


4fttl 1 77 


RTA0000268~F.d. 17.1 .P.Seq 


F 


M00039951B:B12


CH14EDT 


i 1 ^ 

I i O J 


HUU7 / J 


RTA000026S5F.c06.2.P.Seq 


F 


M00039374C:H12


CH12EDT 


1 1 9. A 
I I 54 


jsoi os 

**ju i yo 


RTA00002691F.e.23.2.P.Seq


F 


M00043405A:D11


CH17COHLV 




as i srp 


RTA0000269 1 F.t".03 .2.P.Seq 


F 


M00043406B:G12 


CH17COHLV 


t 1 00 


(IStlll 1 4 


RTA00002693F.i:.18.2.P.Seq 


F 


M00043220B:C04


CH19COP 


1 t 87 
113/ 


J.>^7S7 
*rj J / J— 


RTA00002693F.b.02.2.P.Seq


F 


M00043081D:F05


CH19COP 


1 l 00 




RTA00002687F.g.03.1.P.Seq


F 


M00040207B:D08 


CH14EDT 


I 10" 


4f» 1^7 1 


RTA00002687F.h.19.1.P.Seq


F 


M00040294D:D12 


CH14EDT 


1 1 QCi 
. 1 ly\) 


1 4JOJ 


RTA00002687F.f.08.1.P.Seq


F 


M00040203B:A05 


CH14EDT 


1191


dfU 1 6 1 

tUH 1 O 1 


RTA00002687F.e.20.1.P.Seq


F 


M00039958C:B09


CH14EDT 


1192 


403274 


RTA00002687F.b.10.1.P.Seq


F 


M00039766A:G07 


CH14EDT 


1193 


373465 


RTA00002671F.o.09.1.P.Seq


F 


(VlUUU j iS0ljA.nl- 


CH09LNL 


1 1 O A 




RTA00002686F.m.08.1.P.Seq


F 


M00040265D:C08 


CH13EDT 




HU-.'t 1 


RTA00002686F.1. 16. 1 .P.Seq 


F 


M00040261C:F01


CH13EDT 


i iyo 




RT A0OOO* 1 670F.p. 1 2. 1 .P.Seq 


F 


M0003358!D:D08 


CH09LNL 


1 IV / 


4j J7 JO 


RT A0000 -) 694F.d.24. 1 .P.Seq 


F 


M00043528C:A02


CH20COHLV 


lino 

1 1 Vo 




RTAOOO0" ! 672F.i.02.2. P.Seq 


F 


M00039013D:F02 


CH09LNL 


1 t OO 


HU — O— 4 


RTA0Q002686F.p. 13.1 .P.Seq 


F 
F 


M00040287D:D07 
M00040233A:H02 


CH13EDT 
CH13EDT 


1200
1201




RTA000026S6F.J. 1 6. 1 .P.Seq 
RTA00002690F.clL2-P.Seq 


F 


iM0004276°C:E09 


CH16COP 


1202


7^6704 


RTA00002664F.a. 1 LI .P.Seq 


F 


M00023352D:H05 


CH04MAL 




771 0Q7 


RTA00002690F.b.23.2.P.Seq 


F 


M00042756D:A10 


CH16COP 


1204




RTA000026S5F.2.I7.2.P.Seq 


F 


M000595I7B:G12 


CH12EDT 


1205 


235855 


RTA0000266TF.O.06. 1 .P.Seq 


F 


M00032876C.D06 


CH08LNH


1206 


402789 


RTA000026S6F.S. 1 6. 1 .P.Seq 


F 


(VIUUU4U 1 oj 1 A.rw ' 


CH13EDT 


1207 


19826 


RTA00002710F.k.05.1.P.Seq


F 


M00022467C:B12 


CH03MAH 


1208


JoU I J ' 


RTA000026S2F.h. 19. 1 .P.Seq 


F 


M00039984D:G12 


CH09LNL 


1209 


401187 


RTA00002685F.e.15.2.P.Seq


F 


M00039500C:C04


CH12EDT 


1210 


427346 


RTA00002665F.b.OL3. P.Seq 


F 


M0002S066C:D07 


CH08LNH 


1211


402366 


RTA00002686F.c.15.1.P.Seq


F 


MUUIHU l;JSt3.nu: 


CH13EDT 


1212 


376712 


RTA0000267~F.c 1 3. 2. P.Seq 


F 

r- 


M00039343B:F12 


CH09LNL 
CH12EDT 


1213 
1214 
1215 


401655 
400147 
400864 


RTA00002685Fc.22.LP.Seq 
RTA00002635F.il. 10. 1 .P.Seq 
RTA000026S5 F.g. 1 7. 1 .P.Seq 


r 
F 
F 


M00O395t5A:A06 
M00039517B:G12 


CH12EDT 
CH12EDT 


1216 
1217 


451600 
400147 


RTA00002691F.b.l9.3.P.Seq 
RTA00002685F.g.t0.2.P.Seq 


F 
F 
F 


M0004jj28D:HO- 
M000395I5A:A06 
M0003937SD:H07 


CH12EDT 
CH12EDT 


1218
1219 


401655 
449307 


RTA000026S5F.C.22.2. P.Seq 
RTA0OO026°OF.a. 10.3. P.Seq 


F 
F 


M00042431D:C10 
M00040366A.B01 


CH16COP 
CH14EDT 


1220
1221 
1222 


403121 
451718 
294345 


RTA0000263SF.a.0L2. P.Seq 
RTA000026 Q 2F.e.24. LP.Seq 
RTA000026S5F.S. 14. LP.Seq 


F 
F 


M00043044B:A12 
M00039515D:C1 1 


CH18CON
CH12EDT 






SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY 


1223 


136541 


RTA00002712F.p.23.2.P.Seq


F 


M00027181D:A05 


CH04MAL 


1224 


403898 


RTA00002687F.a.05.2.P.Seq


F 


M00039746C:H06


CH14EDT 


1225 


403541 


RTA00002687F.p.20.1.P.Seq


F 


M00040364A:E05 


CH14EDT


1226 


450773 


RTA00002691F.d.24.3.P.Seq


F 


M00043383D:A02 


CH17COHLV


1227 


376236 


RTA00002685F.l.24.2.P.Seq


F 


M00039595C:E05 


CH12EDT 


1228


422357 


RTA00002688F.c.21.1.P.Seq


F 


M00040385C:D02 


CH14EDT


1229 


404532 


RTA00002687F.p.10.2.P.Seq


F 


M00040351B:F02


CH14EDT


1230 


403693 


RTA00002687F.j.23.1.P.Seq


F 


M00040317D:F02 


CH14EDT


1231 


403693 


RTA00002687F.j.23.2.P.Seq 


F 


M00040317D:F02 


CH14EDT


1232 


401515 


RTA00002685F.o.02.2.P.Seq 


F 


M0003962-iB:F12 


CH12EDT 


1233 


404532 


RTA00002687F.p.10.1.P.Seq


F 


M00040351B:F02


CH14EDT


1234 


452077 


RTA00002692F.d.01.2.P.Seq


F 


M00043002A:E05


CH18CON


1235 


18003 


RTA00002711F.b.04.1.P.Seq


F 


M00022S21CC09 


CH03MAH 


1236 


377014 


RTA00002682F.f.13.1.P.Seq


F 


M00039973D:C08 


CH09LNL 


1237 


404232 


RTA00002687F.n.12.2.P.Seq


F 


M00040334D:C07 


CH14EDT


1238 


404232 


RTA00002687F.n.12.1.P.Seq


F 


M00040334D:C07 


CH14EDT


1239 


406263 


RTA00002685F.d.14.1.P.Seq


F 


M00039493A:C04 


CH12EDT


1240 


452077 


RTA00002692F.c.24.2.P.Seq


F 


M00043002A:E05


CH18CON


1241 


454349 


RTA00002693Fx.09.2.P.Seq 


F 


M000431 33 3:CU 


CH19COP 


1242 


447671 


RTA00002689F.e.12.1.P.Seq


F 


M00042904B:E07 


CH15CON 


1243 


447603 


RTA00002693F.b.14.2.P.Seq


F 


M00043095A:F09


CH19COP 


1244 


456764 


RTA00002694F.c.14.1.P.Seq


F 


M00043465B:H02


CH20COHLV 


1245 


401827 


RTA00002686F.l.19.1.P.Seq


F 


M00040262B:B06 


CH13EDT 


1246 


404520 


RTA00002687F.f.05.1.P.Seq


F 


M00040202A:F05 


CH14EDT


1247 


449798 


RTA00002691F.d.02.3.P.Seq


F 


M00043366A:A02


CH17COHLV


1248


450993 


RTA00002691F.cl2.5.P.Seq 


F 


M00043350D:B11


CH17COHLV 


1249 


377471 


RTA00002691F.c.02.3.P.Seq


F 


M00043339A:F11


CH17COHLV 


1250 


400404 


RTA00002686F.a.17.1.P.Seq


F 


M0003975ZB:G08 


CHUEDT 


1251


19106 


RTA00002691F.e.08.2.P.Seq 


F 


M0004338 C ?C:E03 


CH17COHLV


1252 


404024 


RTA00002687F.e.18.1.P.Seq


F 


M0003995SA:A08 


CH14EDT


1253 


446404 


RTA00002689F.b.14.1.P.Seq


F 


M00042566C:C05


CH15CON 


1254 


392921 


RTA00002677F.k.12.2.P.Seq


F 


M0003941 !C:E0" 


CH09LNL 


1255 


376850 


RTA00002678F.e.lO.:. P.Seq 


F 


M00039458B:HH 


CH09LNL


1256 


453011


RTA00002692F.f.10.2.P.Seq


F 


M00043066B:H11


CH18CON


1257 


234811


RTA00002691F.a.03.3.P.Seq


F 


M00042352D:C01 


CH17COHLV 


1258


402708 


RTA00002686F.m.11.1.P.Seq


F 


M0004026"A:E06 


CH13EDT 


1259 


451013 


RTA00002691F.f.08.2.P.Seq 


F 


■ M0004340PB:B03 


CH17COHLV 


1260 


453011


RTA00002692F.f.10.1.P.Seq


F 


M00043066B:H11


CH18CON 


1261 


380462 


RTA00002670F.n.24.2.P.Seq


F 


M000335"0B:E06 


CH09LNL 


1262 


379602 


RTA000026S1F.C.2I.:. P.Seq 


F 


M0003QS55C.F01 


CH09LNL 


1263 


403396 


RTA00002687F.a.04.1.P.Seq




MOOO.^ :40L .nUJ 


CH14EDT


1264 


403397 


RTA00002687F.h.02.1.P.Seq




M00040219B:D02


CH14EDT


1265 


271723 


RTA00002686F.b.05.1.P.Seq




MO003^755A:308 


CHUEDT 


1266 


451379 


RTA00002691F.b.12.2.P.Seq




M00043312C:E08 


CH17COHLV 


1267 


456624 


RTA00002694F.e.02.1.P.Seq




M000456163:F02 


CH20COHLV 


1268


375433 


RTA00002686F.o.14.1.P.Seq




M00040:"4A:D07 


CH13EDT 


1269 


402229 


RTA00002686F.i.09.1.P.Seq




M00040::iA:GU 


CH13EDT 






SEQ
ID


CLUSTER 


SEQ NAME


ORIENTATION 


CLONE ID


LIBRARY 


1270 


377039 


RTA00002686F.o.12.1.P.Seq


F 


M000402SOC:H05 


CH13EDT 


1271 


18041 


RTA00002710F.h.21.1.P.Seq


F 


M00022262D:G03


CH03MAH 


1272 


401381 


RTA00002685F.o.08.1.P.Seq


F 


M00039626D:F04 


CH12EDT 


1273 


428491 


RTA00002666F.c.05.1.P.Seq


F 


M00032535D:H01


CH08LNH 


1274 


54656 


RTA00002661F.i.22.2.P.Seq 


F 


M000043"2B:F07 


CH01COH 


1275 


379183 


RTA00002679F.U7.1.P.Seq 


F 


M000396S8C.G06 


CH09LNL 


1276 


25594 


RTA00002711F.f.07.1.P.Seq


F 


M00022963B:E02 


CH03MAH 


1277 


403355 


RTA00002687F.d.11.1.P.Seq


F 


M0003994SD:D1 1 


CH14EDT 


1278 


16789 


RTA00002709F.b.09.2.P.Seq 


F 


M00005382B:F08 


CH02COH 


1279 


23292 


RTA00002708F.c.02.1.P.Seq


F 


M00003750D:E06 


CH01COH


1280 


373982 


RTA00002673F.b.24.2.P.Seq 


F 


M0003905SA:A04 


CH09LNL 


1281 


373982 


RTA00002673F.c.01.2.P.Seq


F 


M000390r3A:A04 


CH09LNL 


1282 


449911


RTA00002691F.e.02.2.P.Seq


F 


M0004333^B:302 


CH17COHLV 


1283 


450633 


RTA00002691F.f.02.2.P.Seq


F 


M00043405C:G12


CH17COHLV 


1284 


23939 


RTA00002713FJ.U.LP.Seq 


F 


M00027486A:F06 


CH04MAL 


1285 


450633 


RTA00002691F.f.02.1.P.Seq


F 


M00043405C:G12


CH17COHLV 


1286 


379122 


RTA00002672F.n.14.1.P.Seq


F 


M000390:^B:F09 


CH09LNL 


1287


449429 


RTA00002690F.a.16.3.P.Seq


F 


M0004243"A:D04 


CH16COP 


1288 


430578 


RTA0000266SF.2. 18.1 .P.Seq 


F 


M000329S-CG05 


CH08LNH


1289 


425824 


RTA00002687F.b.17.1.P.Seq


F 


M0003976"C:E12 


CH14EDT 




425824 


RTA00002687F.b.17.2.P.Seq


F 


M0003976"C:E12 


CH14EDT 


1291 


401266 


RTA000026S5F.U l.2.P.Seq 


F 


M00039535D:D10 


CH12EDT 


1292 


377949 


RTA00002674F.p.04.1.P.Seq


F 


M00039200A:C10 


CH09LNL 


1293 


12926 


RTA00002710F.e.21.1.P.Seq


F 


M00022005C:C06


CH03MAH 


1294 


378242 


RTA00002679F.c.20.2.P.Seq 


F 


M00039664D:G07 


CH09LNL 


1295 


401781 


RTA00002686F.e.08.1.P.Seq


F 


M00040160B:A10 


CH13EDT 


1296 


453101 


RTA00002693F.c.16.2.P.Seq


F 


M00043U33:A10 


CH19COP 


1297 


377592 


RTA00002677F.l.12.2.P.Seq


F 


M000394if D:E01 


CH09LNL 


1298 


404340 


RTA00002687F.b.O>. 1 .P.Seq 




M0003976-C:D07 


CH14EDT


1299 


400968 


RTA00002685F.h.01.2.P.Seq


i p 


M000395:iD:H03 


CH12EDT 


1300 


400968 


RTA00002685F.g.24.2.P.Seq




M0003952;D:H03 


CH12EDT 


1301 


374417 


RTA00002671F.j.l-v3. P.Seq 




M000383i:C:Gl I 


CH09LNL 


1302 


374621 


RTA00002675F.p.02.1.P.Seq




M00039263D:A12


CH09LNL 


1303 


19063 


RTA00002708F.L 14.1. P.Seq 




M00004361A:H02


CH01COH


1304 


135941 


RTA00002713F.g.06.1.P.Seq




M0002735^3:G05 


CH04MAL 


1305 


403355 


RTA00002687F.d.11.2.P.Seq


F


M000399-i$D:Dl 1 


CH14EDT


1306 


375226 


RTA00002677F.m.0S.2.P.Seq| F 


M000394rC:AO! 


CH09LNL 


1307 


222658 


RTA00002664F.e.14.2.P.Seq


F


M00027I033:A09 


CH04MAL 


1308 


447978 


RTA00002690F.d.11.3.P.Seq




M00042800A:A03


CH16COP 


1309 


431346 


RTA00002669F.g.24.1.P.Seq




M0003321SA:C04 


CH08LNH


1310 


455579 


RTA00002694F.a.10.1.P.Seq




ivlUUlM.. D . rUO 


CH20COHLV


1311 


13406 


RTA00002709F.l.14.1.P.Seq




M0000712-DH10 


CH02COH 


1312 


378364 


RTA00002674F.o.17.1.P.Seq


F


M00O3^l°cD A07 


CH09LNL 


1313 


373788 


RTA00002671F.c.16.2.P.Seq




M0003S2?^A:GO8 


CH09LNL


1314 


403548 


RTA00002688F.a.10.2.P.Seq




M000403o.O:E09 


CH14EDT


1315 


22425 


RTA00002709F.c.08.2.P.Seq




M000054OSA:H06 


CH02COH 


1316 


452238 


RTA00002692F.c.21.2.P.Seq




M00042°°$A.G04 


CH18CON



WO 01/02568 



PCT/US00/18374 



SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION


CLONE ID

LIBRARY 


1317 


446680 


RTA00002689F.c.04.1.P.Seq


F 


M000426 r s U. hU4 


CH15CON


1318 


142922 


RTA00002712F.g.02.1.P.Seq


F 


M00026ooOd.LUD 


pUAilM A I 


1319 


450196 


RTA00002691F.c.19.3.P.Seq


F 


M0004jjr9B.UlU 




1320 


26017 


RTA00002709F.d.04.1.P.Seq


F 


MOOOO26O i D:UUo 


rnmrriH 


1321 


380355 


RTA00002670F.o.06.1.P.Seq


F 


M000jj: . OC.CIO 


runoi \ii 


1322 


25232 


RTA00002710F.n.22.1.P.Seq


F 


M0002266 . U.tiUi 


\* nuj iviM n 


1323 


378952 


RTA00002683F.h.11.1.P.Seq


F 


M00040Q (Od.dU/ 


rw A oi mi 


1324 


404487 


RTA00002687F.c.13.2.P.Seq


F 


M000j994ob.r IU 




1325 


48482 


RTA00002712F.p.06.1.P.Seq


F 


M00027 I ^9U.rU-> 




1326 


373705 


RTA00002673F.a.13.1.P.Seq


F 


MOOQjVO^-Cr U / 




1327 


373705 


RTA00002673F.a.13.2.P.Seq


F 


M000 J 90 r_C.rU / 




1328 


21162 


RTA00002709F.c.03.1.P.Seq


F 



M0000D4—VD.UU1 




1329 


15203 


RTA00002710F.a.21.1.P.Seq


F 


M000O/9* _b.HI-£ 


CH03MAH


1330 


21162 


RTA00002709F.c.03.2.P.Seq 


F 


M0000^449B.Uu I 




1331 


401013 


RTA00002685F.o.16.2.P.Seq


F 


M00039641A:A05


CH12EDT


1332 


404449 


RTA00002687F.c.04.2.P.Seq


F 


M000j9/ ^0C:E04 


CH14EDT


1333 


429672 


RTA00002668F.b.10.1.P.Seq


F 


M00032909A:B06 




1334 


48541 


RTA000027 1 2F.L07. 1 .P.Seq 


F 


M000269_-C:BO_ 


rlu4iVl/-\L 


1335 


378424 


RTA00002681F.a.03.2.P.Seq


F 


M000j98j9B.BOI 


runoi mi 


1336 


49540 


RTA00002712F.d.24.1.P.Seq


F 


M00023399C:E10


C UAJV1 A I 


1337 


379170 


RTA00002672F.L21. LP.Seq 


F 


MOOOj90 16D:G06 


runoi Ml 


1338 


179540 


RTA00002683F.o.20.2.P.Seq 


F 


M00040 lOOC.tU^ 


rWDQI M! 


1339 


451269 


RTA00002691F.f.11.1.P.Seq


F 


M0004j4 ! I S.UUo 




1340 


449832 


RTA00002691F.e.13.2.P.Seq


F 


M 0004 j j 9. vA: BOS 


ru i imu\ V 


1341 


380119 


RTA00002670F.m.20.2.P.Seq


F 


M00033560D:G07 


runQI MI 


1342 


153094 


RTA00002714F.a.12.1.P.Seq


F 


M0002 / /4.: A: CO j 




1343 


448749 


RTA00002690F.d.14.2.P.Seq


F 


M00042806C:F07 


CH16COP


1344 


448749 


RTA00002690F.d.14.3.P.Seq


F 


M00042806C:F07


CH16COP


1345 


454816 


RTA00002693F.b.16.1.P.Seq


F 


M0004^09oA:O04 


ru i prop 


1346 


374744 


RTA00002670F.i.16.2.P.Seq


F 


M000jj4_ D.rUl 


runO! Ml 


1347 


404449 


RTA00002687F.c.04.1.P.Seq


F 


MOOOjV / . L/C .hU4 


ru UPDT 


1348 


58005 


RTA00002661F.h.14.1.P.Seq


F 



M000U4J--U . cUj 


CH01COH 


1349 


451379 


RTA00002691F.b.12.3.P.Seq


F 


M0004jj I -L.tUo 


ru 1 TfOHLV 


1350 


456323 


RTA00002694F.d.21.1.P.Seq


F 


M0004j^_od.U lu 


rt-pornHLV 


1351 


455957 


RTA00002694F.c.15.1.P.Seq


F 


M0004j46:^ -AUj 




1352 


428063 


RTA00002666F.I.05. LP.Seq 


F 


M000j26joL.vjU5 


ri-tnsi MH 


1353 


374722 


RTA00002676F.j.19.3.P.Seq


F 


M000j9j 1 UA.LU/ 


rHOQf ML 


1354 


428407 


RTA00002665F.p.12.1.P.Seq


F 



M000j2d t L'U.r li 


rwnsi mh 


1355 


378000 


RTA00002681F.j.16.1.P.Seq


F 


M000j98o U.LU4 


rHflOl ML 


1356 


452717 


RTA00002692F.b.17.2.P.Seq




M0004ivooc tuo 


CH18CON


1357 


378000 


RTA00002681F.j.16.2.P.Seq


F 


M000398S"D:C04 


CH09LNL 


1358 


448356 


RTA00002690F.c.03.3.P.Seq 




M00042760A:C12 


CH16COP 


1359 


456629 


RTA00002694F.d.04.1.P.Seq




M0004349 l :C:F04 


CH20COHLV 


1360 


431346 


RTA00002669F.g.24.2.P.Seq 




M0003321SA-.C04 


CH08LNH


1361 


377206 


RTA00002682F.m.14.1.P.Seq


F 


M00040015C:F08


CH09LNL


1362 


453036 


RTA00002692F.b.11.2.P.Seq


F 


M00042960D:H08


CH18CON


1363 


402632 


RTA00002686F.g.15.1.P.Seq


F 


M0004018ZD:D06 


CH13EDT 









SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY 


1364 


230532 


RTA00002664F.c.11.2.P.Seq


F 


M00026901A:G07 


CH04MAL 


1365 


30755 


RTA00002663F.c.03.1.P.Seq


F 


M00022138A:E05


CH03MAH 


1366


451438 


RTA00002691F.d.23.3.P.Seq


F 


M00043383C:F12


CH17COHLV 


1367 


379011 


RTA00002681F.n.23.1.P.Seq


F 


M00039903C:D01


CH09LNL 


1368 


404048 


RTA00002687F.g.01.1.P.Seq


F 


M00040206A:A07 


CH14EDT 


1369 


404048 


RTA00002687F.g.01.2.P.Seq


F 


M00040206A:A07 


CH14EDT 


1370 


452398 


RTA00002692F.f.17.2.P.Seq


F 


M00043125C:A11


CH18CON


1371 


403686 


RTA00002687F.d.03.1.P.Seq


F 


M00039946B:F08


CH14EDT 


1372 


403686 


RTA00002687F.d.03.2.P.Seq 


F 


M00039946B:F08 


CH14EDT 


1373 


404048 


RTA00002687F.f.24.2.P.Seq


F 


M00040206A:A07 


CH14EDT 


1374 


404048 


RTA00002687F.f.24.1.P.Seq


F 


M00040206A:A07


CH14EDT 


1375 


450627 


RTA00002691F.f.01.2.P.Seq


F 


M00043405C:G02 


CH17COHLV


1376 


375589 


RTA00002680F.f.06.2.P.Seq


F 


M00039794A:E04 


CH09LNL 


1377 


379011 


RTA00002681F.n.23.2.P.Seq


F 


M00039903C:D01


CH09LNL 


1378


16789 


RTA00002709F.6.09. 1 .P.Seq 


F 


M00005382B:F08


CH02COH 


1379 


427346 


RTA00002665F.a.24.3.P.Seq


F 


M00028066C:D07


CH08LNH 


1380 


49540 


RTA00OO27I2F.C.0 1.1. P.Seq 


F 


M00023399C:E10


CH04MAL 


1381 


14440 


RTA00002674F.e.14.2.P.Seq


F 


M00039129C:D04 


CH09LNL 


1382 


391401 


RTA00002682F.k.11.1.P.Seq


F 


M00040004D:B03 


CH09LNL 


1383 


43782 


RTA00002662F.d.21.2.P.Seq


F 


M0000716: B:G 1 1 


CH02COH 


1384 


212635 


RTA00002666F.p.01.1.P.Seq


F 


M00032638D:D11


CH08LNH 


1385 


15618 


RTA00002710F.o.05.1.P.Seq


F 


M0002268-A:C02 


CH03MAH 


1386 


18501 


RTA00002669F.g.23.3. P.Seq 


F 


M00033217B:H07


CH08LNH 


1387 


400310 


RTA00002688F.b.05.2.P.Seq 


F 


M00040375OB06 


CH14EDT


1388 


403796 


RTA00002687F.h.17.1.P.Seq


F 


M00040293D:G04 


CH14EDT 


1389 


452314 


RTA00002694F.a.21.1.P.Seq


F 


M00043416C:A02


CH20COHLV 


1390 


119179 


RTA00002712F.k.20.1.P.Seq


F 


M00027021A:G02 


CH04MAL 


1391 


167451 


RTA00002663F.j.11.1.P.Seq


F 


M00022646A:H10 


CH03MAH 


1392 


450523 


RTA00002691F.e.19.2.P.Seq


F 


M0004340iD:GOS 


CH17COHLV 


1393 


289535 


RTA00002693F.f.06.1.P.Seq


F 


M0004320Z3:FOl 


CH19COP 


1394 


374736 


RTA00002673F.o.08.2.P.Seq 


F 


M00039112B:C05


CH09LNL 


1395 


373912 


RTA00002672F.n.01.2.P.Seq 


F 


M00039036C:B05 


CH09LNL 


1396 


134877 


RTA00002662F.d.05.2.P.Seq 


F 


M00007026B:H09 


CH02COH 


1397 


372811 


RTA00002670Fx.l2.2.P.Seq 


F 


M0003334"C:F02 


CH09LNL 


1398 


373296 


RTA00002672F.e.08.2.P.Seq 


F 


M00038994A:A10


CH09LNL 


1399 


373296 


RTA00002672F.e.08.1.P.Seq


F 


M00038994A:A10


CH09LNL 


1400 


452903 


RTA00002692F.f.OS.2.P.Seq 


F 


M00043060D:G12 


CH18CON


1401 


450067 


RTA00002691F.c.17.3.P.Seq


F 


M00043352D:C03 


CH17COHLV 


1402 


451013 


RTA00002691F.f.08.1.P.Seq


F 


M0004340°B:B03 


CH17COHLV 


1403 


212635 


RTA00002666F.o.24.1.P.Seq


F 


M000326SSD:Dll 


CH08LNH 


1404 


452367 


RTA00002692F.c.01.2.P.Seq


F 




CH18CON


1405 


450627 


RTA00002691F.e.24.1.P.Seq


F 


M00043405C:G02


CH17COHLV 


1406 


186438 


RTA0OO027I3F.L15. LP.Seq 


F 


M00027462A:D07 


CH04MAL 


1407 


431066 


RTA00002669F.c.17.3.P.Seq


F 


M00033IS^D:F03 


CH08LNH 


1408 


378912 


RTA00002672F.m.24.2.P.Seq 


F 


M00039036C:B05


CH09LNL 


1409 


15731 


RTA00002709F.L13.LP.Seq 


F 


M00007116C:G02


CH02COH 


1410 


377187 


RTA00002683F.d.21.2.P.Seq


F 


M00040C4"C:F05 


CH09LNL 









SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY 


1411


376107 


RTA00002677F.a.08.2.P.Seq 


F


M00039333D:D09


CH09LNL 


1412 


450580 


RTA00002691F.c.20.3.P.Seq


F


M00043359C:G01


CH17COHLV 


1413 


379942 


RTA00002679F.I.2U. P.Seq 


F


M00039707A:D02 


CH09LNL 


1414 


375589 


RTA00002680F.f.06.1.P.Seq


F


M00039794A:E04 


CH09LNL 


1415 


375789 


RTA00002674F.a.16.1.P.Seq


F


M00039120C:H03


CH09LNL 


1416 


456227 


RTA00002694F.c.16.1.P.Seq


F


M00043465C:C09 


CH20COHLV 


1417 


455852 


RTA00002694F.a.02.1.P.Seq


F


M00042592A:H10 


CH20COHLV 


1418


25169 


RTA00002710F.m.05.1.P.Seq


F


M00022579C:C11


CH03MAH 


1419 


376524 


RTA00002678F.h.23.2.P.Seq


F


M00039477A:B03 


CH09LNL 


1420 


449562 


RTA00002690F.b.13.2.P.Seq


F


M00042515C:F08


CH16COP 


1421 


449562 


RTA00002690F.b.13.3.P.Seq


F


M00042515C:F08


CH16COP 


1422 


286001 


RTA00002690F.b.08.2.P.Seq 


F


M00042511A:H04


CH16COP 


1423 


286001 


RTA00002690F.b.08.3.P.Seq 


F


M00042511A:H04


CH16COP 


1424 


380322 


RTA00002683F.p.21.1.P.Seq


F 


M00040106B:B09


CH09LNL 


1425 


401603 


RTA00002685F.f.23.2.P.Seq 


F


M00039510C:G02


CH12EDT 


1426 


376541 


RTA00002678F.d.13.2.P.Seq


F


M00039456A:C08 


CH09LNL 


1427 


449123


RTA00002690F.a.13.3.P.Seq


F


M00042435A:A11


CH16COP 


1428 


418358 


RTA00002686F.m.07.1.P.Seq


F


M00040265D:B07


CH13EDT 


1429 


380263 


RTA00002689F.a.22.1.P.Seq


F 


M00042543C:G04 


CH15CON 


1430 


455748 


RTA00002694F.b.06.1.P.Seq


F


M00043428D:G08 


CH20COHLV 


1431 


451679 


RTA00002693F.a.04.2.P.Seq 


F


M00042612D:F06


CH19COP 


1432 


396332 


RTA00002686F.k.14.1.P.Seq


F


M00040252C:C06 


CH13EDT 


1433 


377578 


RTA0OOO2683F.b.U.2.P.Seq 


F


M00040037A:E11 


CH09LNL 


1434 


20061 


RTA00002710F.m.14.1.P.Seq


F


M00022597D:A06 


CH03MAH 


1435 


402494 


RTA00002686F.k.16.1.P.Seq


F 


M00040191A:B09 


CH13EDT 


1436 


372798 


RTA00002670F.c.18.2.P.Seq


F 


M00033349D:F05 


CH09LNL 


1437 


236295 


RTA00002679F.a.19.2.P.Seq


F 


M00039655B:H09 


CH09LNL 


1438 


451570 


RTA00002691F.c.03.3.P.Seq


F


M00043340B:H0S 


CH17COHLV 


1439 


35847 


RTA00002708F.h.03.1.P.Seq


F 


M00004239B:F11


CH01COH 


1440 


455706 


RTA00002694F.b.10.1.P.Seq


F


M00043433B:G09 


CH20COHLV 


1441 


346310 


RTA00002684F.d.18.1.P.Seq


F 


M00040122D:A02


CH09LNL 


1442 


139561 


RTA00002676F.j.09.3.P.Seq


F 


M0003930SB:G08 


CH09LNL 


1443 


403200 


RTA00002687F.j.24.1.P.Seq


F


M00040318A:B02


CH14EDT 


1444 


401413 


RTA00002685F.i.03.2.P.Seq 


F 


M00039530B:E02 


CH12EDT 


1445 


448680 


RTA00002690F.b.02.3.P.Seq


F 


M00042440B:E09 


CH16COP 


1446 


117060 


RTA00002679F.h.24.1.P.Seq


F 


M00039686C:C05 


CH09LNL 


1447 


403200 


RTA00002687F.j.24.2.P.Seq 


F 


M00040318A:B02 


CH14EDT 


1448 


448589 


RTA00002690F.a.07.3.P.Seq


F 


M00042349D:D07 


CH16COP 


1449 


373806 


RTA00002674F.o.02.1.P.Seq


F 


M00039179A:G09


CH09LNL 


1450 


377055 


RTA00002682F.k.13.1.P.Seq


F


M00040005B:C11


CH09LNL 


1451 


373111 


RTA00002670F.n.14.2.P.Seq




M00033566C:E08 


CH09LNL 


1452 


12350 


RTA00002713F.a.05.1.P.Seq




M00027195C:E04 


CH04MAL 


1453 


450366 


RTA00002691F.c.06.3.P.Seq




M00043344D:E04 


CH17COHLV 


1454 


397851 


RTA00002680F.b.04.2.P.Seq




M00039775A:A09 


CH09LNL


1455 


403200 


RTA00002687F.k.01.2.P.Seq




M00040318A:B02


CH14EDT 


1456 


403200 


RTA00002687F.k.01.1.P.Seq




M00040318A:B02


CH14EDT


1457 


401142


RTA00002687F.i.24.2.P.Seq




M00040313C:D05


CH14EDT 






SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY 


1458 


375221 


RTA00002679F.k.19.1.P.Seq


F 


M00039702A:B02


CH09LNL 


1459 


403471 


RTA00002687F.a.14.1.P.Seq


F 


M00039~49D:D05 


CH14EDT


1460 


12270 


RTA0000271 i F.t.23. LP.Seq 


F 


M00023007C:E10


CH03MAH 


1461 


401013 


RTA00002685F.o.16.1.P.Seq


F 


M00039641A:A05 


CH12EDT 


1462 


74344 


RTA00002661F.f.10.1.P.Seq


F 


M00003902A:C03 


CH01COH 


1463 


423432 


RTA00002687F.l.10.2.P.Seq


F 


M00040323C:G11


CH14EDT 


1464 


423432 


RTA00002687F.l.10.1.P.Seq


F 


M00040323C:G11


CH14EDT


1465 


379560 


RTA00002682F.g.IS.I.P.Seq 


F 


M00039931A:E08 


CH09LNL 


1466 


122669 


RTA000027 1 2FX22. 1 .P.Seq 


F 


M00026357D:G12


CH04MAL 


1467 


373319 


RTA00002671F.c.17.2.P.Seq


F 


M00038259B:A02 


CH09LNL 


1468 


448034 


RTA00002690F.b.16.2.P.Seq


F 


M00042751C:C12 


CH16COP 


1469 


376366 


RTA00002677F.h.05.2.P.Seq 


F 


MO003939"B:H09 


CH09LNL 


1470 


452253 


RTA00002692F.f.04.2.P.Seq 


F 


M000430-ifD:G12 


CH18CON


1471 


401601 


RTA00002685F.f.13.2.P.Seq


F 


M00039508C:G01


CH12EDT 


1472 


373647 


RTA00002672F.d.04.1.P.Seq


F 


M00038664C:E04 


CH09LNL 


1473 


379721 


RTA00002676F.b.20.2.P.Seq 


F 


M0003^276B:H09 


CH09LNL 


1474 


446404 


RTA00002689F.e.02.3.P.Seq 


F 


M00042SS7C:D07 


CH15CON 


1475 


403738 


RTA00002687F.a.10.2.P.Seq


F 


iM0003 l )748A:Fl I 


CH14EDT


1476 


376887 


RTA00002674F.f.23.2.P.Seq


F 


M00039135D:H02


CH09LNL 


1477 


373787 


RTA00002677F.I.04.2.P.Seq 


F 


M0003 t )4|4D:G03 


CH09LNL 


1478 


401375 


RTA00002685F.n.04.1.P.Seq


F 


M00O396O-tB:E05 


CH12EDT 


1479 


401375 


RTA00002685F.n.04.2.P.Seq 


F 


M0003960-LB:E05 


CH12EDT 


1480 


403232 


RTA00002687F.g.20.2.P.Seq 


F 


M00040218C:C02


CH14EDT 


1481 


403232 


RTA00002687F.g.20.1.P.Seq


F 


M00040218C:C02


CH14EDT


1482 


449080 


RTA00002690F.a.04.2.P.Seq 


F 


M00042347D:H11


CH16COP 


1483 


430973 


RTA00002669F.a.03.4.P.Seq 


F 


M00033r63:E12 


CH08LNH 


1484 


374742 


RTA00002676F.c.12.2.P.Seq


F 


M0003^:79B:CI1 


CH09LNL 


1485 


449741 


RTA00002690F.e.23.2.P.Seq 


F 


M000428f63:H02 


CH16COP


1486 


45341 


RTA00002710F.U 9. LP.Seq 


F 


M00O224^9A:B02 


CH03MAH 


1487 


451220 


RTA00002691F.f.07.2.P.Seq


F 


M0004340SB:D1 1 


CH17COHLV 


1488 


22067 


RTA00002708F.f.12.1.P.Seq


F 


M0000-iU0D:C03 


CH01COH 


1489 


378952 


RTA00002683F.h.11.2.P.Seq


F 


M0004OO~0B:BO7 


CH09LNL 


1490


401435 


RTA00002685F.n.14.2.P.Seq


F 


M0003%0"D:E08 


CH12EDT 


1491 


375284 


RTA00002676F.g.21.2.P.Seq


F 


M0003 l >2^SD:B04 


CH09LNL 


1492 


449080 


RTA00002690F.a.04.3.P.Seq


F 


M00042347D:H11


CH16COP 


1493 


37897 


RTA00002661F.b.15.1.P.Seq


F 


M0000t4~63:G10 


CH01COH 


1494 


7572 


RTA00002709F.h.03.1.P.Seq


F 


MOOOQt>SO^B:B09 


CH02COH 


1495 


377076 


RTA00002682F.f.14.1.P.Seq


F 


M0003 L > i) "D:DI2 


CH09LNL 


1496 


374828 


RTA00002674F.m.10.2.P.Seq


F 


M0003^1 T 0A:B10 


CH09LNL 


1497 


400295 


RTA00002685F.a.17.2.P.Seq


F 


M0003^3o3A:C09 


CH12EDT 


1498 


401435 


RTA00002685F.n.14.1.P.Seq




M0003 l H)0"D:EOS 


CH12EDT


1499 


374680 


RTA00002676F.c.14.1.P.Seq




M0003o:-QC;B08 


CH09LNL 


1500 


399018 


RTA00002684F.d.20.2.P.Seq 




M00040i::.A:A09 


CH09LNL 


1501 


376351 


RTA00002678F.c.19.2.P.Seq




M000}O4f2C:G09 


CH09LNL 


1502 


19699 


RTA00002710F.f.18.1.P.Seq




M00022105C:C12


CH03MAH 


1503 


394113 


RTA00002665F.d.15.3.P.Seq




M0002S.U-iD:F05 


CH08LNH


1504 


452652 


RTA00002692F.a.16.1.P.Seq




M00042ti2~C:D0I 


CH18CON






SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY 


1505 


450791 


RTA00002691F.b.23.3.P.Seq 


F 


M0004333SB:A03 


CH17COHLV 


1506 


20112 


RTA00002711F.b.16.1.P.Seq


F 


M00022S30D:D01 


CH03MAH 


1507 


455142 


RTA00002694F.b.08.1.P.Seq


F 


M0004343iD:B08 


CH20COHLV 


1508 


117060


RTA00002679F.i.01.1.P.Seq


F 


M00039686C:C05


CH09LNL 


1509 


447859 


RTA00002689F.d.13.1.P.Seq


F 


M0004273"C:H04 


CH15CON 


1510 


452572 


RTA00002692F.e.16.1.P.Seq


F 


M00043034D:C01 


CH18CON


1511 


448639 


RTA00002690F.a.06.3.P.Seq


F 


M00042348B:E05 


CH16COP


1512 


378947 


RTA00002683F.o.12.2.P.Seq


F 


. M00040098CBG1 


CH09LNL 


1513 


403599 


RTA00002687F.i.12.2.P.Seq


F 


M00040299B:F10


CH14EDT 


1514 


404084 


RTA00002688F.d.16.2.P.Seq


F 


M000403923:H01 


CH14EDT 


1515 


375243 


RTA00002680F.d.24.1.P.Seq


F 


M0003978SC:A01 


CH09LNL 


1516 


229665 


RTA00002664F.c.08.2.P.Seq 


F 


M00026885A:H09 


CH04MAL 


1517 


450270


RTA00002691F.a.18.3.P.Seq


F 


M0004251SD-.D04 


CH17COHLV 


1518 


448841 


RTA00002690F.d.10.3.P.Seq


F 


M00042799D:F08


CH16COP 


1519 


447613 


RTA00002689F.c.11.1.P.Seq


F 


M00042693D:E01 


CH15CON 


1520 


453909 


RTA00002693F.d.24.2.P.Seq 


F 


M00043173D:G03 


CH19COP 


1521 


400213 


RTA00002685F.a.06.2.P.Seq 


F 


M00039 1 8-3:309 


CH12EDT 


1522 


403738 


RTA00002687F.a.10.1.P.Seq


F 


M00039743A:F11


CH14EDT


1523 


456725 


RTA00002694F.e.14.1.P.Seq


F 


M00043648A:G07 


CH20COHLV 


1524 


230842 


RTA00002665F.n.15.1.P.Seq


F 


M0003249ZA:C01 


CH08LNH 


1525 


450149 


RTA00002692F.a.20.2.P.Seq 


F 


M00042630A:C05 


CH18CON


1526 


34343 


RTA00002709F.a.13.1.P.Seq


F 


M00005297D:H08 


CH02COH 


1527 


403956 


RTA00002688F.c.12.2.P.Seq


F 


M00040383A:H02 


CH14EDT 


1528 


375243 


RTA00002680F.e.01.2.P.Seq


F 


M0003978SC:A01 


CH09LNL 


1529 


375243 


RTA00002680F.d.24.2.P.Seq 


F 


M0003973SC:A01 


CH09LNL 


1530 


373647 


RTA00002672F.d.04.2.P.Seq 


F


M00038664C:E04


CH09LNL 


1531 


376897 


RTA00002674F.L20.1. P.Seq 


F 


M0003916:3:H09 


CH09LNL 


1532 


23468 


RTA00002708F.e.02.1.P.Seq


F 


M0000399iC:F06 


CH01COH


1533 


455134 


RTA00002694F.a.05.1.P.Seq


F 


M00042593A:C02 


CH20COHLV 


1534 


455327 


RTA00002694F.a.22.1.P.Seq


F 


M00043417C:D05 


CH20COHLV 


1535 


455189 


RTA00002694F.c.09.1.P.Seq


F 


M0004346iD:C02 


CH20COHLV 


1536 


455688 


RTA00002694F.c.13.1.P.Seq


F 


M00043476A:F07 


CH20COHLV 


1537 


456286 


RTA00002694F.b.23.1.P.Seq


F 


M00043450C:C06 


CH20COHLV 


1538 


455833 


RTA00002694F.a.23.1.P.Seq


F 


M00043418A:H10 


CH20COHLV 


1539 


456308 


RTA00002694F.d.22.1.P.Seq


F 


M00043527C:E09 


CH20COHLV 


1540 


452720


RTA00002694F.d.14.1.P.Seq


F 


M000435163:H09 


CH20COHLV 


1541 


455319 


RTA00002694F.b.13.1.P.Seq


F 


M00043437D:D04 


CH20COHLV 


1542 


455813 


RTA00002694F.c.24.1.P.Seq


F 


M000434833:G10 


CH20COHLV 


1543 


451814 


RTA00002692F.e.20.2.P.Seq


F 


M000430403:307 


CH18CON


1544 


448659 


RTA00002690F.b.07.3.P.Seq


F 


M00042470C:E05


CH16COP 


1545 


450578 


RTA00002691F.b.20.3.P.Seq






CH17COHLV 


1546 


451193 


RTA00002691F.b.01.3.P.Seq




M00043304C:D02 


CH17COHLV


1547 


451981 


RTA00002692F.c.23.2.P.Seq




M00043001D:D03 


CH18CON


1548 


447859 


RTA00002689F.d.13.2.P.Seq




M0004273~C:H04 


CH15CON 


1549 


449415 


RTA00002690F.a.23.3.P.Seq




M00042439B:D03 


CH16COP 


1550 


451193 


RTA00002691F.a.24.2.P.Seq




M00043304C:D02


CH17COHLV 


1551 


452032 


RTA00002692F.e.04.2.P.Seq
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326 


14947 


DT \ AAAAinAOir l> TJ 1 P 

R 1 AUUUUiyO-r.K.^J.I .r.oeq 


p 
r 


MAAAAA7J.^ A -PlAJ. 


PHn?POH 


327 


829D 


K 1 AUUVJU.yu jr.K.^ j. 1 .r.oeq 


p 
r 


\>rnnnn7n'i i A Pn^ 


PHn^P OH 


328


136277 


DT \ A AA AO OA "7 C 1 1 1 I P Cjn 

R 1 AUUOOiyO/r.1. 1 j. 1. r.oeq 


p 
r 


X/tnnm o *> *; 7 n ■ n n A 




329


1T7C t 


DT \ AAAA^COTC 1 Is ! DCxi 

R 1 A'JUUU-oy /r.l. 1 j. i. r.oeq 


p 
r 


N/fnnnn-'^^^p- a j 

iVlUUUU-T—O— \- ..n.. 




330


7869 


OT \ AAAAOO t TIT ; 1 C 1 D C Jn 

R 1 AUUU0»y I /r.|.13. 1. r.oeq 


p 
r 


m Ann ^ 7 7 j.0 n - nn ^ 




331


156009 


OT \ AAAAOOATC L- AC 1 D C Jn 

R 1 AUUUUiyU/r.K.Uj.l. r.oeq 


p 
r 


iVyfAnnmin a • AA7 


runur AT4 


332


9453 


D T \ AAAAOOATC 1, 1 I ID C 

R rA000UJyU/r.k.2l.l.r.oeq 


p 
r 


\^rAAAmiQR -R I 1 
IVIUUU- - --0D .D L 1 


run *\/f aH 


333 


186052 


R 1 A00UU-V 1 2r.n.Uo. 1. r.oeq 


p 
r 


\ fAAA171AJ.R -P 1 
MUUU- / J0-+D .£1: 


ryn/ivt a I 


334 


669 


DT \ AAAAOil | TC f OO 1 D C Jrt 

R 1 AUUUUiy 1 / r.T.l-. I .r.oeq 


p 
r 


VfAAA^777 ^n-T-TA"* 
MUUUJ- / -JU. nu_ 


pHnsr MK 

v^riv/OL-i^ii 


335


11609 


OTA AAAAO COAC f 11 I D Cj« 

R 1 AUUUOioyyr.t.ij.i.r.oeq 


p 
r 


\/f AA A Azl s A7 n - P A * 
iVIUUUUHJU / U.-CUJ 


PHniPOH 


336 


186075 


DT V AAAA^O MCI- 1 Q 1 P C»/i 

Rl AUUUUiyi ir.k.iy.i. r.oeq 


p 
r 


K.rnAA?7AS7P n 1 n 

IVIUUU- /UJ / V_ . U l\J 


PT-TOdVt AT 


337


933 


O T \ AAAAOi"\ 1 IP 1 *>A 1 P C»,r. 

R 1 AUUUU_y 1 1 r.LiU. 1 .r.oeq 


p 
r 


KjTAAA77A^ 1 A - A A^ 
IVlUUU-i /Uo l.-\..-\Uo 


PHndVf AT 


338


1 14j0 


D T \ AAAA^ ^ O A7 "> D 

R 1 AUUUU-oy_r.e.U/.-.r.oeQ 


p 
r 


vfnnnn^^n^R pn" 




339


1839j8 


OT V AAAAOO 1 IT a 14 1 D C_»t, 

R 1 AUUUU-y 1 1 r.0.»4. 1 .r.oeq 


p 
r 


Vfnnn77 1 1 t- p i i 


PPTndXI AT 


340 


12394 


DT \ AAAAOO 1 <TT m l> 1 D C Jrt 

R 1 AUUOU-y Ljr.m. 13.-.r.oeQ 


p 
r 


LMUUUJ--+V / \J . D 1U 


PWnST MT4 


341 


186588 


OT* V AAAAIO 1 IC 1 A^ 1 D 

R 1 Au000-y 1 lr.l.Uj. 1. r.oeq 


p 
r 


V f AAA7 7AA_1 R * H AA 
1MUUU- /UO-tD . UUO 


P1-Tnd\T AT 








p 


M00022600D:B05 


CH03MAH 


343 


4727 


RTA00002905F.S. 19. l.P.Seo 


F 


M0000S059D:30S 


CH03MAH


344 


17048 


RTA00002887F.l.10.1.P.Seq


F 


M00001416B:A05 


CH01COH


345 


2354 


RTA00002916F.o.03.1.P.Seq


F 


M00032645D:C01 


CH08LNH 


346 


19S7 


RTA00002894F.a.13.1.P.Seq


F 


M00003974D:E02 


CH01COH


347 


244S3 


RTA00002897F.i.21.1.P.Seq


F 


M00004269A:G11


CH01COH


348


33337 


RTA00002896F.f.08.1.P.Seq


F 


M00004155A:H03


CH01COH


349 


11641 


RTA00002916F.m.19.1.P.Seq


F 


M00032637A:F09 


CH08LNH


350 


10307 


RTA00002910F.l.01.1.P.Seq


F 


M00022995C:G07


CH03MAH






SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION


CLONE ID


LIBRARY


351 


20388 


RTA00002906F.a.04.1.P.Seq


F 


M00021700D:H03


CH03MAH


352 


24687 


RTA00002903F.m.02.1.P.Seq 


F 


M00007048B:EU 


CH02COH 


353 


10414 


RTA00002919F.n.19.1.P.Seq


F 


M00033232B:C08 


CH08LNH 


354 


11053


RTA00002892F.h.16.2.P.Seq


F 


M00003820B:F11



CH01COH 


355 


6574 


RTA00002917F.o.17.1.P.Seq


F 


M00032797D:D08 


CH08LNH


356 


18782 


RTA00002905F.f.07.1.P.Seq


F 


M00008021C:G12


CH03MAH


357 


35896 


RTA00002896F.d.04.1.P.Seq


F 


M00004146C:B04 


CH01COH


358 


35 IS 


RTA00002930F.j.10.1.P.Seq


F 


M00056217D:E10 




359 


8320 


RTA00002915F.f.17.1.P.Seq


F 


M00028782A:F01 


CH08LNH


360 


10208 


RTA00002897F.h.08.1.P.Seq


F 


M00004251D:D03 




361 


2089 


RTA00002896F.2. 14. 1 .P.Seq 


F 


M00004159D:F12 


run i rnn 


362 


170919 


RTA00002909F.p.03.1.P.Seq


F 


M00022727A:G01 


rUA^Avt AT4 


363 


8727 


RTA00002917F.o.02.1.P.Seq


F 


M00032791B:H11




364 


33184 


RTA00002898F.d.08.1.P.Seq


F 


M00004324A:D10




365 


27973 


RTA00002905F.2. 13.1 .P.Seq 


F 


M00008055D:G03 


CH03MAH


366 


15835 


RTA00002897F.k.13.1.P.Seq


F 


M00004278C:B10 


pun i rnu 


367 


10273 


RTA00002903F.n.03.1.P.Seq


F 


M00007081B:E09 




368 


2832 


RTA00002899F.f.03.1.P.Seq


F 


M00004502A:D12 




369 


32022 


RTA00002903F.m.12.1.P.Seq


F 


M00007060D:G07 


rumrnu 


370 


68176 


RTA00002893F.H. 1 ll.PSeq 


F 


M00003898C:A01


CH01COH


371 


29373 


RTA00002915F.n.l4.2.P.Seq 


F 


M00032508A:E03




372 


23235 


RTA00002925F.k.02.1.P.Seq


F 


M00039929B:E06 


CH09LNL


373 


12111 


RTA00002895F.o.17.1.P.Seq


F 


M00004122C:D01




374 


5737 


RTA00002924F.k.02.1.P.Seq


F 


M00039672C:D05 


CH09LNL


375 


72475 


RTA00002915F.L15. l.P.Seq 


F 


M00032490D:EOS 




376 


7027 


RTA00002907F.o.01.1.P.Seq


F 


M00022264A:B02




377 


17165 


RTA00002903F.d.19.1.P.Seq


F 


M00006907A:C09




378 


26446 


RTA00002894F.m.17.1.P.Seq


F 


M0000404 / C : Buy 


pun i rou 


379 


6755 


RTA00002918F.k.24.1.P.Seq


F 


M00032944A:B07




380 


9336 


RTA00002909F.n.02.1.P.Seq


F 


M00022703D:B11


CH03MAH


381 


6960 


RTA000029 1 6F.O.0S. 1 .P.Seq 


F 


M00032647B:F06 


r'TJHQT NTH 


382 


472 


RTA00002911F.S.OL l.P.Seq 


F 


M00026936D:C07 


CH04MAL


383 


9460 


RTA00002908F.c.03.1.P.Seq


F 


M00022376D:D05




384 


10307 


RTA00002910F.k.24.1.P.Seq


F 


M00022995C:G07 


CH03MAH


385 


4623 


RTA00002923F.d.22.1.P.Seq


F 


M00039222B:A04 


/-Tjnnr vtt 


386 


141167 


RTA00002905F.c.09.1.P.Seq


F 


M000O79S0A:BOl 




387 


34011 


RTA00002898F.m.10.1.P.Seq


F 


M0000438^C:H12 


ryni row 


388 


5965 


RTA00002915F.a.07.1.P.Seq


F 


M00028620C:C07


r"i-rnQT nth 


389 


12336 


RTA00002915F.se.04. l.P.Seq 


F 


M000287S4A:D12 




390 


36492 


RTA00002393F.f.lS. l.P.Seq 


F 


M00003891B:H02 


r^urn i cnu 

K^tVJ ILUn 


391 


29803 


RTA00002908F.k.06.1.P.Seq


F 


M00022467D:BLo 


V, tlVJJ iVlrvTl 


392 


4420 


RTA00002920F.a.15.1.P.Seq


F 


MUUlO J)J-Od , ouJ 


CH08LNH


393 


15097 


RTA00002923F.b.06.1.P.Seq


F 


M00039175A:F01 


CH09LNL


394 


19133 


RTA00002894F.2.03. l.P.Seq 


F 


M00003993C:D07 


CH01COH 


395 


9S10 


RTA00002905F.c.03.1.P.Seq


F 


M00007975C:A10 


CH03MAH


396 


31562 


RTA00002897F.a.09.1.P.Seq


F 


M00004210A:A03 


CH01COH 


397 


1499 


RTA00002912F.k.12.1.P.Seq


F 


M00027475D:A01


CH04MAL


398 


29531 


RTA00002907F.o.05.1.P.Seq


F 


M00022265A:Fn 


CH03MAH


399 


4287 


RTA00002918F.j.20.1.P.Seq


F 


M0003292SC:D02 


CH08LNH


400 


2S660 


RTA00002905F.p.11.1.P.Seq


F 


M00021690A:C03 


CH03MAH 






SEQ 
ID 


CLUSTER


SEQ NAME


ORIENTATION


CLONE ID


LIBRARY



401 


4596 


RTA00002898F.l.21.1.P.Seq


F 


M00004j4lC:EOO 


CH01COH 



402 


21774 


RTAQO0Q2ovor.e. 20. l.P.Seq 


F 


ll.f AAAA iinn, r\Al 

M00004j22B:DU3 


CH01COH 


403 


5611


n t i aaaaoa i ct? i o i n c 

RTA000029 loF.c. 1 2. 1 .P.Seq 


F 


M0002S7/4D:b 10 


CH08LNH 


404 


7030 


RTA00002894F.L 1 j. l.P.Seq 


F 


M00004042B:A11


CH01COH


405 


11736 


RTA0OO02S93F.e.09. l.P.Seq 


F 


M00004jjOA:A01 


CH01COH 


406 


94732 


r> » AAA/\^A 1 AT"* t 1 n P 

RTA000029 lOF.e. 17. 1 .P.Seq 


F 


M00022856D:A07 


t tai \ r a r r 


407 


30283 


RTA0000292oF.g. 19. 1. P.Seq 


F 


M0003925oD:B01 


CH09LNL 



408 


129779 


o t i aaaao aa a . i o i n c . _ 

RTA00002904F.a. 18. 1. P.Seq 


F 


1/AAAA7 1 1"\AT 

M00007 1doC:D07 


CH02COH 


409 


4635 


i~k -r» » aaaao aaac : o i i o o « _ 

RTA0OO029O0F.J .2 1 . 1 .P.Seq 


F 


M00005 j49C : C02 


CH02COH 


410 


5879 


RTA0000289jF.r.Q8. l.P.Seq 


F 


MOOOOjSSSB :F09 


CH01COH


411


119206


RTAOQOU2yU0r.m. lo.l.r.Sec 


tr 

r 


M0002 loOOD: A 1 1 


LriUjiVlAn 


A 1 T 

412 


£LC\AC 
0940 


K i AUuUU2yjUr.g. iy.2.r.oeq 


c 
r 


k >f A A A CCOOAD , IT ] A 


LHOLUIN 


A 1 1 


42462 


K 1 AUUUU2yU2r.r. 12. l.r.oeq 


r 


Kjf AAAAiC^ ^ I n.HAO 


LHU-LUH 


A 1 A 

414 


24285 


R l AOOUU2o9or.m. 17. l.P.Seq 


r? 

r 


M00004 1 89 A : L 1 2 


LrtULLUri 


A 1 <C 

415


13769 


o a aaaaoaa i c it i d c 

Rl AU0UU29U 1 r.a. 1 / . 1 .r ,Seq 


r 


R A AAAAC in A. r\ A*7 

MOOOL) 5 4 2 3 (_ : UU / 


CHU2L.UH 


A \ £L 

416


17039 


o t* a aaaao on^c ; i t i o c 

RTAU00U2S9or.i. 14. l.P.Seq 


c 

r 


M00004169A:hU4 


CHUlLOH 


417 


14397 


RTAQ0002S9or.j.l 1.1. P.Seq 


r* 

r 


X f AAAA .1 1 "71 r~\ . n I o 

M00004172U:B 12 


AT rn | AALT 


A 1 O 

418


14351 


RTA00002SSSF.L'.2 1. 1. P.Seq 


r 


X TAAAA 1 .1 A .1 / — ' T~\ 1 1 

M0000 1444L:U1 1 


CH01COH


419 


5579 


RTA00002S9jF.j.I Ll. P.Seq 


r 


» / AAAA^A 1 1 \ \ AO 

M0O0O j9 1 4 A : AOS 


CH01COH


420 


24186 


RTA000029l4F.n.02. 1. P.Seq 


F 


* fAAAA 0*^^^0 DAO 

M00028j66B:B08 


CH08LNH


AO 1 

421 


11433


RTA000U292 1 F.c.Oo. 1 .P.Seq 


F 


\ f AAA T *^ "» tOO. t~A ^ 

M0003 j j42 B : F0-? 


CH09LNL


A OO 

422 


186635 


D T* A AAAA1A | 1 r i' f\£. 1 D C » — 

RTA0Q00291 lF.t.Oo. 1. P.Seq 


F 


M00026907D:E07 


pr TA t \ f v T 

CH04MAL 


All 

423


5955 


RTAOOOO-VlJr.a. lo. l.r.Seq 


r 


X IAAA1 0"7~7 1 \ .PA1 

M0002S77 1 A:h02 


rtJAOT \rLI 


AO ■! 

424 


220o3 


RTA000U_S94F.k.09. 1. P.Seq 


r 


fc. /fAAAA/IA 1 r~\ . /"* 1 1 

M000040j6D:C 12 


/— TT A 1 rALT 

CHUICOH 


425


AO CO 

9259 


O T * AA/*\A1 0 1 C C U. Art 1 o c 

RTA000029 1 SF.b.09. 1 .P.Seq 


r 


X (AAA* 1 0 O TAA. p\ AA 

M000j2830D: DU2 


ptjAOf \rrj 


' /<1/C 

426


254o7 


O T A AAAA"> AACr , 1 T> C „ 

RT A000O29U0 r .0.2 j . 1 .P.Seq 


rr 

r 


m ,f A A AO 1 C O I . A A 

M0002 1 Oo IL :C09 


CH03MAH


427 


8488 


O T A AAAA^A i 1 — • /»vrt 1 O P 

RTA0O0O29 1 6F.i .02. 1 .P.Seq 


F 


Hv f A A AO OCAAO ITAl 

M00032o90B:H01 


CH08LNH


428 


4884 


RTA00002919F. 0.12.1. P.Seq 


F 


X f A A A ^ A 1 O r\ f T 1 1 

M0O0j324SD:HI 1 


CH08LNH


429 


9804 


ot ^ rto/^.o'^o i c mine 

RTA00QO-9 L3r.c.l9. 1. P.Seq 


F 


l,f AAAIOTil lO F> A ~* 

M0002S764B:DOj 


pTJACI \TT_T 

CHOoLNH 


430 


179954 


RTA00002910F.J.04. 1. P.Seq 


F 


\ ^AAAA^A/" 1 V f\ A ^ 

M00022964A:BOj 


f~* \ IA"> \ l ITT 

CHlbMAH 


431


186532 


RTA000O29 12F.U.0 1.1. P.Seq 


F 


\ .f AAAA O 1 OA/' D ( A 

M00027 1 89C : B 1 0 


/-" rjA i \ ( i i 

CHU4iVl.AJ - 


/in 
432 


l iuio 


Ota AAAm OO ,|C : t « t D Cm 

R I AUUUU_oy4Ki. lo. l.r.Seq 


r 


K.fAAAA.lAOAn, \ A| 

MUUUU4U2y U : A(J 1 


^ tja i rnu 


433


8824 


OT \ AAAA^nA^C U 1 1 1 o C Jn 

R 1 AUUUU-VUjr.D. 1 / . l.r.oeq 


r 


\ fAAAAi<0"TD . r~~ AA 


L.rlU_LUM 


A 1 -1 

4j4 


4063 


R 1 AUUUU_y lor. K.U1. l.r.oeq 


r 


\ fAAA^O/l 1 "» \ VZ 1 1 

M00032ol jA:hi 1 


/~> MAC T \TU 


435


7964 


Rl AUUUUJSyor.i. lo. l.r.Seq 


r 


IK .1 A AAA, 1 1 1A \ - ITAl 

MUUUU4 1 /UA:rU3 


run i rnu 


/IK 

436


9238 


OT \ AAAAin 1 cc : 1A I DC 

RTA00OO29l5r. j. 20. l.r.Seq 


r 


XitAAA^O.lT^D . \ A 1 ^ 

M0003247 jB : AUj 


CHUo LiNn 


A1H 

437 


2841


RTA000U-9 14F.1. lo. 1 .P.Seq 


F 


IJAAATO lAii \ . A ^ 

M0002 S 1 9 o A : Cj 0 J 


CHU^LNH 


438


11203


RTA00002886F.p.16.1.P.Seq


F 


k t AAA A 1 "» CT A PN T T AO 

M 0000 1 jS2D:H08 


CH01COH


439 


8800 


RTA00002888F.c.20.1.P.Seq


F 


M00001444B:E04 


CH01COH


440 


3224 


RTA0000-9 1 6F.d.2_v 1 .P.Seq 


F 


M000j2do6D:AOj 


CH08LNH


441 


95423 


RTA00002909F.k.24.1.P.Seq


F 


M00022674C:H08


CH03MAH






k l «-\uuuu_y20r.L . 1 1 . .i.r.oeq 


F


[VILHJU4UU 1 J\J . uuv 




443 


88052 


RTA00002925F.p.11.1.P.Seq


F 


M00040041D:F01 


CH09LNL 


444 


32736 


RTA00002900F.l.20.1.P.Seq


F 


M00005367D:A11


CHOICOH 


445 


20811


RTA00002896F.n.14.1.P.Seq


F


M00004192C:B06 


CH01COH


446 


12856


RTA00002908F.b.07.1.P.Seq


F 


M000223CSA:3ll 


CH03MAH 


447 


12190


RTA00002899F.b.10.1.P.Seq


F 


M00004430B:B10


CH01COH


448 


10546 


RTA00002901F.O.0S.1. P.Seq 


F 


M000056S9C:B02 


CH02COH


449 


21041


RTAOOOO2S98F.k.0S.l. P.Seq 


F 


M00004372A:E12 


CH01COH


450 


164S4 


RTA00002894F.c.04.1.P.Seq


F 


M00003979B:A04 


CH01COH



SEQ
ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID


LIBRARY 


451 


7741 


RTA00002S94F.i.0S. l.P.Seq 


F 


M0000402SB:FIO 


CH01COH


452 


14921 


RTA00002926F.c.15.2.P.Seq


F


M00040031C:E02


CH09LNL 


453 


17571 


RTA00002900F.m.16.1.P.Seq


F 


M00005375D:A10


CH02COH 


454 


46881 


RTA00002901F.l.20.1.P.Seq


F 


M00005622A:H02 


CH02COH 


455 


21533 


RTA00002898F.i.10.1.P.Seq


F 


M00004365C:C09 


CH01COH


456 


19010 


RTA00002916F.k.08.1.P.Seq


F 


M00032614D:D03 


CH08LNH 


457 


48768 


RTA00002886F.n.01.1.P.Seq


F 


M00001374C:B10


CH01COH


458 


7515 


RTA00002892F.p.22.2.P.Seq


F 


M00003855C:F02


CH01COH


459 


17326 


RTA00002898F.h.02.1.P.Seq


F 


M00004350A:A04


CH01COH


460 


3902 


RTA00002901F.d.17.1.P.Seq


F 


M00005460D:C11 


CH02COH 


461 


12400 


RTA00002901F.d.18.1.P.Seq


F 


M00005461A:D12 


CH02COH 


462 


186543 


RTA00002912F.a.06.1.P.Seq


F 


M00027193C:A07


CH04MAL 


463 


4063 


RTA00002916F.j.24.1.P.Seq 


F 


M00032613A:E11


CH08LNH 


464 


6267 


RTA00002910F.d.20.1.P.Seq


F


17237


RTA00002901F.l.12.1.P.Seq


F 


M00005616B:F07 


CH02COH 


638 


1 1 14S 


RTA00002900F.J. 1 S. t .P.Seq 


F 


M00005346D:A03 


CH02COH 


639 


14837 


RTA00002925F.n.20.1.P.Seq


F 


M00040025A:B04


CH09LNL


640 


4343 


RTA00002S97F.1. 1 j. 1 .P.Seq 


F 


M000042S2B:D07 


CH01COH 


641 


1S6S6 


RTA00002898F.j.16.1.P.Seq


F 


M00004366D:C11


CH01COH 


642


lUUvU 


K 1 AUUUU-oyJr.n. iu.-.r.beq 


F


MU0UU j b42 D : HOv 


CH01COH


643 


612 


RTA00002889F.d.13.2.P.Seq


F 


M00001535B:E02


CH01COH 


644 


10752 


RTA00002892F.n.06.2.P.Seq


F 


M00003842D:D11


CH01COH


645 


167203 


RTA00002914F.c.14.1.P.Seq


F 


M00028070A:H09


CH08LNH


646 


21269 


RTA00002901F.j.15.1.P.Seq


F 


M00005570A:B0S 


CH02COH 


647 


186250 


RTA00002910F.a.21.1.P.Seq


F 


M00022797D:A06 


CH03MAH 


648


24633 


RTA00002907F.L 19.2.P.Seq 


F 


M0002220SB:D03 


CH03MAH 


649 


12295 


RTA00002918F.c.02.1.P.Seq


F 


M00032836B:A07


CH08LNH


650 


7870 


RTA00002905F.b.22. 1 .P.Seq 


F 


M00007973B:D11 


CH03MAH


SEQ
ID 


CLUSTER 


SEQ NAME


ORIENTATION


CLONE ID


LIBRARY 


651 


12225 


RTA00002902F.d.08.1.P.Seq


F 


MUUUUtO ojA.UU/ 


rumrnn 
LHU-LUn 


652 


7775 


RTA00002892F.o.12.2.P.Seq


F 


\,f AAAA1 Q 1 ~" \ • LTA 1 
MUUUU jo4 ; A.nU4 


run i rr\u 


653 


14901 


RTA00002929F.t".2 1 . 1 .P.Seq 


F 


M0004Q j49 u : UU / 


r*u t lCPiT 
L.n I4tU L 


654 


6831 


RTA00002927F.b.21.1.P.Seq


F 


M000j94cSjA:U IU 


nrj i tcpiT 
Crl IJ.C.U t 


655 


10738 


RTA00002930F.b.08.1.P.Seq


F 


MUUU42/.i4A.OU0 




656 


17986 


RTA00002932F.a.20.1.P.Seq


F 


x„f AAA,no"7"i^"*.CA t 
MUUU4jy / ^L.rU4 




657 


23163 


RTA00002895F.h.03.1.P.Seq


F 


x *aaaa -iaq \ .uni 
MUQUU4UoOA.HUl 




658 


4838 


RTA00002923F.i.15.1.P.Seq


F 


xjcaaa^atq ♦ P\ ■ UH7 
MUUU jyio4U-rlU / 


punQI NTT 


659 


25386 


RTA00002905F.e.05.1.P.Seq


F 


X/fAAAAQAATQ ■ CA*? 

MUUUUoUU / 1> . cUJ 




660 


13217 


RTA00002887F.n.01.1.P.Seq


F


\A AAAA 1 A** 1 R • PiA^i 

MUUUU 1 4 — 'D . U\JO 


PHOirOH 


661 


30656 


RT A00002906F. 1 .03 . l.P.Seq 


F


\A AAAOOA^ ~* A • ClAS 

MUUUiiUj- A.wUj 


nuji»inn 


662 


7852 


RTA00002889F.e.14.1.P.Seq


F


\A AAA A 1 < 1 *i R • A Cll 


V_ 1 Ivy L \_ \y 1 1 


663 


13217 


RTA00002887F.m.24.1.P.Seq


F


AyfAAAAl 1 ~> R ■ Pi AA 

MUUUU 14 d. UUO 


rHOiroH 


664 


15152 


RTA00002925F1.24. l.P.Seq 


F 


\ A A A A *i Q Q 7 1 Q * W Cil 
MUUU->Vo / J D . MU4 




665


24143 


RTA00002922F.o.18.1.P.Seq


F 


X/fAAAIOl l^Pt*PlA, 

Muuujy i4ju.c iu 


rT-TOQT N7 


666 


23872 


RTA00002S92F.L 13.1 .P.Seq 


F 


* jf AAA A1 Q 1 » D • A AA 

MUUUU Jo - J D - AUG 


rHniroH 


667 


13940 


RTA00002906F.2.23. l.P.Seq 


F 


MUUU 2 iyo / U.rlUO 


r'lJA'ix.r a u 
n. u j iv u-vri 


668 


25759 


RTA00002907F.m.10.1.P.Seq


F 


x/TAaatih ion-r'ni 
MUUUiJ-4yU.L.Ui 




669 


5761 


RTA00002924F.p.05.1.P.Seq


F 


K,f AAA1CT7Q APt- A 1 A, 
MUUUjy /oOD. A iU 




670 


41703 


RTA00002901F.e.23.1.P.Seq


F 


\A AAAA-^-? AAPl- C 1 1 

MUUUUDOUuD.c 1 1 




671 


7165 


RTA00002909F.i.06.1.P.Seq


F 


\A AAAOK 1 Q A -PlAQ 

MUUU-i043A.UUo 




672 


41492 


RTA00002889F.m.18.1.P.Seq


F 


MUUUU 1 3 02 A.riUj 


rHniroH 


673 


9331 


RTA00002906F.2. 10. L.P.Seq 


F 


MUUU2 ly^^D.tiUo 


n v j iv i .-vri 


674 


7961 


RTA00002S87F.2.24. 1 .P.Seq 


F 


X A AAA A 1 1 DO D • D A ! 

muuuu i jyyts.isui 




675 


15367 


RTA00002893F.n.17.1.P.Seq


F 


> jr AAA At AC C/" -LIAQ 

MOOuUjy^dCHUS 




676 


185628 


RTA00002912F.f.17.1.P.Seq


F 


\jfAAAT7i i cir* ■CC\'\ 
MUUUi/ J li/L.LUJ 




677 


7386 


RT A0000289 1F.1. 14. 1 .P.Seq 


F 


x fnnAA"7£or\.nAQ 




678 


67391 


RTA00002893F.p.07.1.P.Seq


F 


X ,i AAAA' , A^v^*./ r ^A'5 

MUUUU jyoiCUUJ 




679 


46380 


RTA00002906FX 10. 1 .P.Seq 


F 


x fAArti i iV. Q .CAT 

MUUUz ly^JD.rU. 


rwo^x-f AH 


680 


14265 


RTA00002892F.e.05.2.P.Seq


F 


XrfAAAAIQA^ A -P 1 I 

MUUUUJoUoA.r 1 1 




681 


186478 


RTA00002912F.f.07.1.P.Seq


F 


XyfAAAT71 1 l^'COt 

MUUU 1 i J L ±U I 


THCUM A I 


682 


8192 


RTA00002916F.m.07.1.P.Seq


F


x,f aaa^i/; ^ j o .ptAQ 
MUUUj-0j ! 4D .\J\jy 


CH08LNH


683 


13776 


RTA00002925F.1. 10. l.P.Seq 


F 


MUUU J? V . .r 1 1 




684 


11796 


RTA00002912F.e.02.1.P.Seq


F


M" A A AO TOO, 1 A -riHS 
MUUU- / - 7 1 A.wUO 


CH04MAL


685 


10827 


RTA00002919F.i.10.1.P.Seq


F


\yTAAA*5*l 1 JTP.RAQ 
MUUU J J 14 /tw.-DUCS 


PHOSI >jTI 

V^nUOL.! 11 ! 1 


686 


1482 


RT A00002925F.1. 12.1 .P.Seq 


F 


X/TAAAIOOTTQ -Pt 1 0 

Muuujyy/ 'D.L'i- 




687 


30300 


RTA00002906F.I;. 16. l.P.Seq 


F 


H f AAA1 t A 1 I \ -PlAO 

Muuui i y4 1 a. uuy 


pT-IATVf AH 


688 


10454 


RTA00002S90F.L 15. l.P.Seq 


F 


X A AAAA 1 C 1 T~\. \ 1A 

MOOOUlo-^U.A iU 


run i pdh 


689 


16649 


RTA00002907F.I.0 1 . l.P.Seq 


F 


\,f AAAnO'"*or\«cA i 
MUUU 2 - i - y J-/ ■ cU I 


ruflUf AH 


690 


7026 


RTA00002887F.b.10.1.P.Seq


F 


\A AAAA 1 1*2 "*"Q -All 

MUUUU 1 Jo D.All 




691 


5691 


RTA00002895F.n.13.1.P.Seq


F 


\<fAAAAJ t t if-Pii 1 
MUUUU4 1 .UU 


THOl POH 




I J ly t 


dt AnoOfTQ 1SF i *>1 I P Sea 


F 


M0003291SD:B04 


CH08LNH 


693 


5187 


RTA00002923F.n.03.1.P.Seq


F 


M0003933SB:F07 


CH09LNL 


694 


186115 


RTA00002912F.i.01.1.P.Seq


F 


M00027376C:A02 


CH04MAL 


695 


4S26 


RTA00002917F.e.24.1.P.Seq


F 


M00032729A:F10 


CH08LNH


696 


6733 


RTA00002917F.m.11.1.P.Seq


F 


M00032774C:C04 


CH08LNH


697 


7604 


RTA00002923F.j.05.1.P.Seq


F 


M00039291D:F02 


CH09LNL 


698 


46459 


RTA00002905F.f.01.1.P.Seq


F 


M0000S0:0D:F02 


CH03MAH


699 


23385 


RTA00002889F.i.23.1.P.Seq


F 


MOO0O155iD:D01 


CH01COH 


700 


7516 


RTA00002891F.h.11.1.P.Seq


F 


M00003749C:C0S 


CH01COH


731
732 
733 



185698 
24702 
12595 



RTA00002911F.d.03.2.P.Seq
RTA000O2S93F.L17.LP.se. 
RTA00002904F.c.06.1.P.Seq



M00026836B:H03 
M00004360C:D09 
M00007197B:B05 




CH04MAL 
CH01COH 
CH02COH 



WO 01/02568 



PCT/US00/18374 



SEQ
ID


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY 


751 


32293 


RTA00002901F.i.13.1.P.Seq


F 


M00005535B:B01 


CH02COH 


752 


8913 


RTA00002901F.j.07.1.P.Seq


F 


M00005557D:H10


CH02COH 


753 


185819


RTA00002912F.a.20.1.P.Seq


F 


M00027215A:F06 


CH04MAL 


754 


10559 


RTA00002898F.o.12.1.P.Seq


F 


M00004406A:G09 


CH01COH


755 


8740 


RTA00002923F.o.11.1.P.Seq


F 


M000393S3A:H07 


CH09LNL 


756 


160257 


RTA00002907F.L 1 2.2.P.Seq 


F 


M00022237C:E04 


CH03MAH 


757 


6078 


RTA00002930F.c.11.1.P.Seq


F 


M00055433D:G03 


CH15C0N 


758 


12543 


RTA00002927F.b. 14. 1 .P.Seq 


F 


M00039377B:E05 


CH12EDT 


759 


9686 


RTA00002930F.f. 19. 1 .P.Seq 


F 


M00055794A:E10


CH15C0N 


760 


3369 


RTA00002930F.b. 12. 1 .P.Seq 


F 


M00042732B:H06 


CH15C0N 


761 


6891 


RTA00002895F.i.03.1.P.Seq


F 


M000040S7C:E02 


CH01COH 


762 


13666 


RTA00002892F.L05. LP.Seq 


F 


M00003822C:A09


CH01COH 


763 


6925 


RTA00002930F.k.24.1. P.Seq 


F 


M00056458C:E01


CH15C0N 


764 


11351 


RTA00002901F.g.15.1.P.Seq


F 


M00005504D:F06 


CH02COH 


765 


11497 


RTA00002889F.a.21.1.P.Seq


F 


M00001512D:F08 


CH01COH 


766 


1596 


RTA00002922F.m.18.1.P.Seq


F 


M00039125D:H12 


CH09LNL 


767 


186519 


RTA00002924F.a.22.1.P.Seq


F 


M00039411D:D09 


CH09LNL 


768


24429 


RTA00002903F.j.04.1.P.Seq


F 


M000069S9B:G05 


CH02COH 


769 


33795 


RTA00002902F.k. 18. 1 .P.Seq 


F 


M00006739B:A04 


CH02COH 


770 


24267 


RTA00002889F.I.17. LP.Seq 


F 


M00001561D:H04


CH01COH


771 


12536 


RTA00002891F.j.20.1.P.Seq


F 


M00003760C:G10


CH01COH


772 


22627 


RTA00002887F.k.07.1.P.Seq


F 


M00001410A:G10 


CH01COH


773 


24430 


RTA00002901F.h.20.1.P.Seq


F 


M00005520B:E01


CH02COH


774 


16151 


RTA00002897F.I.22. LP.Seq 


F 


M000042S4A:F08 


CH01COH


775 


6148 


RTA0O002S90F.L 16. LP.Seq 


F 


M00001623D:E12 


CH01COH


776 


106064 


RTA0000290SF.1. 19. LP.Seq 


F 


M000224S5B:E07 


CH03MAH 


777 


9573 


RTA00002893F.p.13.1.P.Seq


F 


M00003970D:H07 


CH01COH


778 


19542 


RTA0O0O2902F.I.2O. LP.Seq 


F 


M00006756B:G06 


CH02COH 


779 


16672 


RTA00002889F.b.21.1.P.Seq


F 


M000015:SC:C03 


CH01COH


780 


8573 


RTA00002891F.p.07.1.P.Seq


F 


M000037S5D:F07 


CH01COH


781 


15746 


RTA00002896F.h.10.1.P.Seq


F 


M0000416?C:A03 


CH01COH


782 


4500 


RTA00002887F.b.08.1.P.Seq


F 


M000013S"A:C12 


CH01COH


783 


16003 


RTA00002910F.c.08.1.P.Seq


F 


M00022820A:F07


CH03MAH 


784 


18723 


RTA00002916F.g.18.1.P.Seq


F 


M000325SOD:A09 


CH08LNH 


785 


4270


RTA00002922F.b.01.1.P.Seq


F 


M00038616C:C09 


CH09LNL 


786 


30095 


RTA00002907F.i.20.1.P.Seq


F 


M0002220SC:E04 


CH03MAH 


787 


42916 


RTA00002924F.c.08.1.P.Seq


F 


M0003943?B:D06 


CH09LNL 


788 


13652 


RTA00002902F.j.09.1.P.Seq


F 


M000067 UC :D06 


CH02COH 


789 


6972 


RTA00002902F.j.06.1.P.Seq


F 


M00006712C:H01 


CH02COH 


790 


4519 


RTA00002910F.i.06.1.P.Seq


F 


M0002294"B:D02 


CH03MAH 


791 


13106 


RTA00002928F.f.09.1.P.Seq


F 


M00040224C:F06


CH13EDT 


792 


98186 


RTA00002909F.m.08.1.P.Seq


F 


M00022696B:C11


CH03MAH


793 


3167 


RTA00002898F.g.09.1.P.Seq


F 


M0000434^D:C12 


CH01COH


794 


3272 


RTA00002897F.a. 1 S. 1 .P.Seq 


F 


M000042L:D:C0j 


CH01COH


795 


14446 


RTA00002899F.d.05.1.P.Seq


F 


M00004462D:D12 


CH01COH


796 


17865 


RTA00002918F.a.13.1.P.Seq


F 


M00032825B:F08


CH08LNH 


797 


5834 


RTA00002898F.h.12.1.P.Seq


F 


M00004352A:D08 


CH01COH


798 


14533 


RTA00002896F.k.24.1.P.Seq


F 


M0000417?C:B06 


CH01COH


799 


15222 


RTA00002900F.j.05.1.P.Seq


F 


M00005332A:C06 


CH02COH 


800 


22594 


RTA00002898F.h.21.1.P.Seq


F 


M0000435"3:B06 


CH01COH






SEQ ID  CLUSTER  SEQ NAME                     ORIENTATION  CLONE ID          LIBRARY
801     9204     RTA00002890F.h.20.1.P.Seq    F            M00001619C:H09    CH01COH
802     186464   RTA00002911F.d.09.2.P.Seq    F            M00026842D:C02    CH04MAL
803     5441     RTA00002900F.a.11.1.P.Seq    F            M00004824D:H05    CH02COH
804     32544    RTA00002893F.l.21.1.P.Seq    F            M00003935B:B01    CH01COH
805     15351    RTA00002915F.j.15.1.P.Seq    F            M00032471D:A05    CH08LNH
806     13129    RTA00002898F.a.12.1.P.Seq    F            M00004310B:E02    CH01COH
807     186376   RTA00002912F.k.21.1.P.Seq    F            M00027485C:F07    CH04MAL
808     17816    RTA00002901F.o.04.1.P.Seq    F            M00005674C:F04    CH02COH
809     8434     RTA00002923F.l.22.1.P.Seq    F            M00039326C:B08    CH09LNL
810     22146    RTA00002922F.i.08.1.P.Seq    F            M00039067B:F07    CH09LNL
811     31912    RTA00002904F.a.14.1.P.Seq    F            M00007154A:E06    CH02COH
812     1487     RTA00002925F.n.03.1.P.Seq    F            M00040016C:E07    CH09LNL
813     24777    RTA00002900F.n.02.1.P.Seq    F            M00005380B:H10    CH02COH
814     144483   RTA00002902F.d.01.1.P.Seq    F            M00006577A:H10    CH02COH
815     6546     RTA00002935F.p.16.1.P.Seq    F            M00055425C:A04    CH17COHLV
816     5984     RTA00002935F.p.09.1.P.Seq    F            M00055420A:E06    CH17COHLV
817     24441    RTA00002900F.a.22.1.P.Seq    F            M00004832D:G04    CH02COH
818     20889    RTA00002935F.h.09.1.P.Seq    F            M00054807D:C11    CH17COHLV
819     127721   RTA00002915F.c.18.1.P.Seq    F            M00023763A:G11    CH08LNH
820     20684    RTA00002900F.c.03.1.P.Seq    F            M00004843A:G12    CH02COH
821     30095    RTA00002907F.i.20.2.P.Seq    F            M0002220SC:E04    CH03MAH
822     6763     RTA00002892F.o.01.2.P.Seq    F            M00003845D:G03    CH01COH
823     6763     RTA00002892F.n.24.2.P.Seq    F            M00003845D:G03    CH01COH
824     48725    RTA00002907F.l.22.2.P.Seq    F            M00022240B:C12    CH03MAH
825     21260    RTA00002935F.c.22.1.P.Seq    F            M00054499A:C0S    CH17COHLV
826     42572    RTA00002930F.c.21.1.P.Seq    F            M00055454A:D02    CH15CON
827     3441     RTA00002935F.i.13.1.P.Seq    F            M00054890C:D05    CH17COHLV
828     21419    RTA00002930F.b.13.1.P.Seq    F            M00042734A:F05    CH15CON
829     8004     RTA00002910F.b.08.1.P.Seq    F            M00022805B:A10    CH03MAH
830     185870   RTA00002912F.c.06.1.P.Seq    F            M00027247C:D02    CH04MAL
831     24580    RTA00002930F.d.01.1.P.Seq    F            M00055466A:F06    CH15CON
832     5153     RTA00002930F.b.16.1.P.Seq    F            M00042743D:G10    CH15CON
833     8653     RTA00002895F.f.17.1.P.Seq    F            M000040S0C:C04    CH01COH
834     23799    RTA00002924F.l.23.1.P.Seq    F            M0003969SC:B03    CH09LNL
835     11012    RTA00002930F.j.09.1.P.Seq    F            M00056215D:F02    CH15CON
836     46592    RTA00002900F.b.19.1.P.Seq    F            M00004339B:C12    CH02COH
837     6650     RTA00002908F.m.12.1.P.Seq    F            M00022491D:A10    CH03MAH
838     16618    RTA00002889F.n.18.1.P.Seq    F            M0000156SC:A03    CH01COH
839     18274    RTA00002889F.g.05.1.P.Seq    F            M00001543C:A0S    CH01COH
840     20694    RTA00002908F.h.08.1.P.Seq    F            M00022442B:G03    CH03MAH
841     9493     RTA00002909F.m.11.1.P.Seq    F            M0002269SC:D10    CH03MAH
842     6132     RTA00002897F.c.04.1.P.Seq    F            M00004220D:C11    CH01COH
843     1S6259   RTAuOOU-ly i.r.m. l i r.^ec  F            M000275:?B:C05    CH04MAL
844     3769     RTA00002916F.g.22.1.P.Seq    F            M00032581B:A09    CH08LNH
845     365S4    RTA00002935F.f.12.1.P.Seq    F            M000546S3D:G11    CH17COHLV
846     38077    RTA00002890F.e.06.1.P.Seq    F            M00001605B:B05    CH01COH
847     3927     RTA00002935F.a.12.1.P.Seq    F            M00042516B:D01    CH17COHLV
848     4275     RTA00002914F.b.16.1.P.Seq    F            M000280o3C:H01    CH08LNH
849     12554    RTA00002921F.a.23.1.P.Seq    F            M0003330:A:E11    CH09LNL
850     13761    RTA00002901F.f.22.1.P.Seq    F            M000054895:C0S    CH02COH





SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY


851 


19059 


RTA00002897F.c.22.1.P.Seq


F 


M00004237C:D10 


CH01COH


852 


22944 


RTA00002935F.b.17.1.P.Seq


F 


M00043355A:D07


CH17C0HLV 


853 


2189 


RTA00002925F.j.06.1.P.Seq


F 


M00039921A:B10


CH09LNL 


854 


19153 


RTA00002892F.h.04.2.P.Seq


F 


M00003819B:B01 


CH01COH


855 


1833 


RTA00002890F.e. 13.1 .P.Seq 


F 


M00001606B:A10 


CH01COH


856 


18447 


RTA00002935F.d.23. l.P.Seq 


F 


M00054569A:B07 


CH17C0HLV 


857 


2461 


RTA00002922F.b.08. 1 .P.Seq 


F 


M00038619B:F09


CH09LNL 


858 


15917 


RTA00002896F.j.06.1.P.Seq


F 


M00004172C:A08


CH01COH


859 


9379 


RTA000O2935F.a. 15. l.P.Seq 


F 


M00043299A:B10 


CH17C0HLV 


860 


5511 


RTA00002931F.b.06. l.P.Seq 


F 


M00042796A:A10 


CH16COP


861 


10540 


RTA00002891F.k.16.1.P.Seq


F 


M00003764B:H11


CH01COH


862 


12117 


RTA00002899F.a.09. l.P.Seq 


F 


M00004419A:G02 


CH01COH


863 


8777


RTA00002919F.a.23. l.P.Seq 


F 


M00033028D:C10


CH08LNH 


864 


23972 


RTA00002900F.o.18.1.P.Seq


F 


M00005403C:A01 


CH02COH 


865 


17005 


RTA00002896F.m. 10. l.P.Seq 


F 


M000041S7B:C02 


CH01COH


866 


1085 


RTA00002924F.1.20. l.P.Seq 


F 


M00039694C:H01


CH09LNL 


867 


4270 


RTA00002922F.a.24. 1 .P.Seq 


F 


M00038616C:C09 


CH09LNL 


868 


4609 


RTA00002935F.e. 15.1 .P.Seq 


F 


M00054599D:B03


CH17C0HLV 


869 


6889 


RTA00002919F.c.07.1.P.Seq


F 


M00033037B:F04 


CH08LNH 


870 


15228 


RTA000029 1 9F.e.06. 1 .P.Seq 


F 


M00033055D:D02 


CH08LNH 


871 


20971 


RTA00002904F.a.22.1.P.Seq


F 


M0000715SD:D03 


CH02COH 


872 


5174 


RTA0O0O2935F.a.23. l.P.Seq 


F 


M00043313D:E09 


CH17C0HLV 


873 


15236 


RTA00002928F.e.16.1.P.Seq


F 


M0004019SA:F12 


CH13EDT 


874 


9223 


RTA00002896F.b.15.1.P.Seq


F 


M00004141A:D01


CH01COH


875 


24591 


RTA00002923F.g.10.1.P.Seq


F 


M00039251C:H12 


CH09LNL 


876 


36306 


RTA00002888F.1.11. l.P.Seq 


F 


M000014S5C:F06 


CH01COH


877 


3309 


RTA00002893F.j.21.1.P.Seq


F 


M00003916A:E04 


CH01COH


878 


186712 


RTA0000291 lFx. 1 l.2.P.Seq 


F 


M00026809A:H08


CH04MAL 


879 


9090 


RTA00002S91F.S.23. l.P.Seq 


F 


M00003746C:E11


CH01COH


880 


11510 


RTA00002888F.i.07.1.P.Seq


F 


M0000U67CD04 


CH01COH


881


9784 


RTA00002889F. j. 15.1 .P.Seq 


F 


M00001554C:G10 


CH01COH


882 


25618 


RTA00002930F.a. 11 . 1 .P.Seq 


F 


M00042554A:D01 


CH15CON


883 


12493 


RTA00002928F.i. 11. l.P.Seq 


F 


M000402S9D:C06 


CH13EDT 


884 


24361 


RTA00002933F.b.08. l.P.Seq 


F 


M00043134A:F05 


CH19C0P 


885 


12449


RTA00002930F.d.02. 1 .P.Seq 


F 


M00055468A:A08


CH15C0N 


886 


17894 


RTA00002929F.a.04.1.P.Seq


F 


M00039747B:B06 


CH14EDT 


887 


13204 


RTA00002930F.f.09. l.P.Seq 


F 


M00055745B:A08


CH15C0N 


888 


32119 


RTA00002930F.C. 19. 1 .P.Seq 


F 


M0005544SB:E05 


CH15CON


889 


5909 


RTA00002935F.L23. l.P.Seq 


F 


M00054931D:E10 


CH17C0HLV 


890 


24453 


RTA00002927F.d.15.1.P.Seq


F 


M00039526A:A08


CH12EDT 


891 


46982 


RTA00002935F.k.22. l.P.Seq 


F 


M00055093B:A03 


CH17C0HLV 


892 


43888 


RTA00002932F.b.23. 1 .P.Seq 


F 


M000430"0A:C03 


CH18CON


893 


24580 


RTA00002930F.C.24. 1 .P.Seq 


F 


M00055466A:F06


CH15CON


894 


186495 


RT A00002927F.a.2 1 . 1 .P.Seq 


F 


M000393o4D:E05 


CH12EDT 


895 


12420 


RTA0OOO2932F.5.2 1 . 1 .P.Seq 


F 


MO0043Ob3C:H05 


CH18C0N 


896 


3833 


RTA00002916F.c.14.1.P.Seq


F 


M00032562C:F01


CH08LNH


897 


10438 


RTA00002930F.j. 13. l.P.Seq 


F 


M00056230D:E07


CH15CON


898 


12367 


RTA00002922F.n. 10. l.P.Seq 


F 


M00039133B:D06 


CH09LNL 


899 


5012 


RT A00002930F.k. 16. 1 .P.Seq 


F 


M00056342A:C03


CH15CON


900 


6458 


RTA00002929F.c.21.1.P.Seq


F 


M00040291A:G10 


CH14EDT






SEQ
ID


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID


LIBRARY 


901 


16507 


RTA00G02929F.d.0S. l.P.Seq 


F 


M00040298B:B09 


CH14EDT 


902 


13914 


RTA00002922F.h. 1 8. l.P.Seq 


F 


M00039063C:H09 


CH09LNL 


903 


11590 


RTA00002930F.k. 18. l.P.Seq 


F 


M00056436C:F01


CH15C0N 


904 


15380 


RTA00002928F.L06. l.P.Scq 


F 


M00040287A:C11


CH13EDT 


905 


10190 


RTA00002895F.k.07. l.P.Seq 


F 


M00004096D:F02 


CH01COH 


906 


12593 


RTA00002934F.a. 12. l.P.Seq 


F 


M00043485C:C03


CH20COHLV 


907 


112813 


RTA00002905F.0. 14. 1 .P.Seq 


F 


M00021677A:D09 


CH03MAH


908 


15929 


RTA00002930F.j. 18. l.P.Seq 


F 


M00056244C:H05 


CH15C0N 


909 


16670 


RTA00002935F.0. 16. l.P.Seq 


F 


M00055387C:C12


CH17COHLV


910 


10924 


RTA00002907F.k.12.2.P.Seq


F 


M00022224A:C07


CH03MAH 


911 


6233 


RTA00002896F.b. 17. 1 .P.Seq 


F 


M00004141B:B01


CH01COH 


912 


14777 


RTA000O2897F.k.09. l.P.Seq 


F 


M00004277D:B02 


CH01COH 


913 


12797 


RTA00002935F.h.0 1 . 1 .P.Seq 


F 


M000547S1D:A11 


CH17C0HLV 


914 


186041 


RTA00002912F.c.01.1.P.Seq


F 


M00027244C:B06


CH04MAL 


915 


8182 


RTA00002931F.a.22.1.P.Seq


F 


M00042766C:D05


CH16C0P 


916 


23088 


RTA00002888F.p.20. 1 .P.Seq 


F 


M00001506B:D11


CH01COH


917 


24298 


RTA00002935F.p.lS.i.P.Seq 


F 


M00055473OF02 


CH17COHLV


918 


40621 


RTA00002896F.k.12.1.P.Seq


F 


M00004176C:A09 


CH01COH


919 


7124 


RTA00002935F.b.07. 1 .P.Seq 


F 


M00043328C:E04


CH17C0HLV 


920 


21107 


RTA0000290 1 F.i .02. l.P.Seq 


F 


M00005524C:H04 


CH02COH 


921 


10807 


RTA00002928F.c.15.1.P.Seq


F 


M00040162A:E02 


CH13EDT 


922 


12162 


RTA00002915F.j.23.1.P.Seq


F 


M00032475A:A06 


CH08LNH 


923 


14747 


RTA0000293 1 F.a. 1 S. 1 .P.Seq 


F 


M00042512D:D10 


CH16C0P 


924 


6824 


RTA0000293 1 F.b.23. 1 .P.Seq 


F 


M00042S57CE01 


CH16C0P 


925 


39115 


RTA00002932F.a. 17. l.P.Seq 


F 


M00042967D:C01 


CH18CON


926 


9484 


RTA00002934F.a. 13. l.P.Seq 


F 


M00043490C:F02 


CH20COHLV 


927 


77981 


RTA00002890F.j.21.1.P.Seq


F 


M00001633D:C11 


CH01COH


928 


16061 


RTA00002932F.b.24. 1 .P.Seq 


F 


M00043113C:G09


CH18CON


929 


4834 


RTA00002930F.d. 12. l.P.Seq 


F 


M00055f27B:E01 


CH15C0N 


930 


9427 


RTA00002935F.e.21. l.P.Seq 


F 


M00054623C:F05 


CH17COHLV


931 


167736 


RTA00002935F.L 11. l.P.Seq 


F 


M00055117A:E02 


CH17C0HLV 


932 


16524 


RTA00002935F.b. 19. l.P.Seq 


F 


M00043358C:A02 


CH17C0HLV 


933 


23496 


RTA000O2932F.a.21. l.P.Seq 


F 


M000429"6D:C01 


CH18CON


934 


163647 


RTA00002901F.m.17.1.P.Seq


F 


M00005634A:F07 


CH02COH 


935 


14239 


RT A0000293 1 F.b.07. 1 .P.Seq 


F 


M00042801C:D01


CH16C0P 


936 


25574 


RTA00002386F.S.20. l.P.Seq 


F 


M00001353C:A05


CH01COH


937 


2737 


RT A00002932F.3.0S. 1 .P.Seq 


F 


M000425SSCE02 


CH18C0N 


938 


6925 


RTA00002930F.l.01.1.P.Seq


F 


M00056458C:E01 


CH15CON


939 


21106 


RTA00002925F.p. 16. l.P.Seq 


F 


M00040045B:H07 


CH09LNL 


940 


28134 


RTA00002917F.k.23. l.P.Seq 


F 


M00032765A:C05 


CH08LNH 


941 


186496 


RTA00002917F.1. 16. l.P.Seq 


F 


M000327"0C:G11 


CH08LNH 


942 


21625 


RTA00002S93F.j. IS. l.P.Seq 


F 


M00003^15C:G0S 


CH01COH


943 


12537 


RTA00002930F.2.23. l.P.Seq 


F 


M00055919B:C10


CH15C0N 


944 


15577 


RTA00002902F.f.15.1.P.Seq


F 


M000Obc56B:EO4 


CH02COH 


945 


6106 


RTA0000Z935F.rM 1.1. P.Seq 


F 


M00054cS2B:H02 


CH17C0HLV 


946 


17136 


RTA00002935F.h.14.1.P.Seq


F 


M00054S1SB:F10 


CH17C0HLV 


947 


2582 


RTA00002921F.1. 14. l.P.Seq 


F 


M000334i3A:A08 


CH09LNL 


948 


16638 


RTA00002899F.a.15.1.P.Seq


F 


M00004420D:E05 


CH01COH


949 


8869 


RT A00002929F.a.05. 1 .P.Seq 


F 


M00039™4SC:G09 


CH14EDT 


950 


14426 


RTA000029UF.C. 12. l.P.Seq 


F 


M0002S0o9D:H02 


CH08LNH






SEQ
ID


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY


951 


11994 


RTA00002890F.h.14.1.P.Seq


F 


M00001617C:F10


CH01COH 


952 


186664 


RTA00002932F.a.05.1.P.Seq


F 


M00042585D:D03


CH18CON


953 


162235 


RTA00002907F.j.06.2.P.Seq 


F 


M00022212D:G02


CH03MAH 


954 


2127 


RTA00002912F.o.14.1.P.Seq


F 


M00027605B:D09


CH04MAL 


955 


41014 


RT A0000290 1 F.n.04. 1 .P.Seq 


F 


M00005641B:E09 


CH02COH 


956 


17636 


RTA00002933F.c.19.1.P.Seq


F 


M00043222C:B06


CH19COP 


957 


2328 


RTA00002935F.e.05.1.P.Seq


F 


M00054579A:C02 


CH17COHLV 


958 


15414 


RTA00002935F.p. 13. 1. P.Seq 


F 


M00055423C:H10


CH17COHLV 


959 


11948 


RTA00002895F.o.01.1.P.Seq


F 


M00004118C:D12


CH01COH 


960 


24759 


RTA00002903F.n.05. l.P.Seq 


F 


M0O0O70S2D:E05 


CH02COH 


961 


15152 


RT A00002925F..2.0 1 . 1 .P.Seq 


F 


M00039873B:H04


CH09LNL 


962 


14917 


RTA00002922F.b.02. l.P.Seq 


F 


M00038616D:B07 


CH09LNL 


963 


12941


RTA00002889F.c.15.1.P.Seq


F 


M00001532A:G08 


CH01COH


964 


29676 


RTA00002931F.b.03.1.P.Seq


F 


M00042788A:F04 


CH16C0P 


965 


17789 


RTA00002S9IF.a.2UvP.Seq 


F 


M00001671A:H10 


CH01COH


966 


45097 


RTA00002928F.g.06. LP.Seq 


F 


M00040247D:D02 


CH13EDT 


967 


18407 


RT A00002909F.b. 11.1 .P.Seq 


F 


M00022546B:E05 


CH03MAH 


968 


22309 


RTA00002900F.n. 19. l.P.Seq 


F 


M00005392A:G06 


CH02COH 


969 


109382 


RTA00002907F.k. 13. l.P.Seq 


F 


M00022224A:G07 


CH03MAH 


970 


92273 


RTA00002909F.j.17.1.P.Seq


F 


M00022662D:H03 


CH03MAH 


971 


8403 


RTA000029 15F.j.22. i .P.Seq 


F 


M00032474A:G03 


CH08LNH 


972 


7763 


RTA00O02928F.h. 10. 1 .P.Seq 


F 


M00040267D:A12 


CH13EDT 


973 


13470 


RT A00002930F.k.09. 1 .P.Seq 


F 


M00056304A:H05 


CH15C0N 


974 


1484 


RTA00002921F.k.10.1.P.Seq


F 


M00033556D:C10 


CH09LNL 


975 


10345 


RTA00002892F.o.19.2.P.Seq


F 


M0000384SC:G09 


CH01COH 


976 


17242 


RTA0000293 lF.a.05. LP.Seq 


F 


M00042433A:E11 


CH16C0P 


977 


171180 


RTA00002909F.L24. 1 .P.Seq 


F 


M0002261SB:D09 


CH03MAH 


978 


16790 


RTA00002914F.c.03.1.P.Seq


F 


M00028067A:C11


CH08LNH


979 


139516 


RTAOOOO29O3F.1.02. LP.Seq 


F 


M00007032C:A12


CH02COH 


980 


4825 


RTA00002930F.b. 15. LP.Seq 


F 


M00042742B:E04 


CH15C0N 


981 


8830 


RTA00002930F.a.05.1.P.Seq


F 


M00042528C:H01 


CH15C0N 


982 


12398 


RTA00002935F.o.19.1.P.Seq


F 


M00055391B:C07


CH17C0HLV 


983 


17867 


RTA00002900F.C. 14. LP.Seq 


F 


M00004S50A:B02 


CH02COH 


984 


15796 


RTA00002935F.b. 12. LP.Seq 


F 


M00043339C:F11


CH17C0HLV 


985 


185669 


RTA00002935F.f. 13. l.P.Seq 


F 


M000546S6A:A09 


CH17C0HLV 


986 


13638 


RTA00002935F. j.20. 1 .P.Seq 


F 


M00055002B:E0S 


CH17C0HLV 


987 


8280 


RTA00002930F.e. 12. LP.Seq 


F 


M00055653C:B07 


CH15C0N 


988 


12632


RT A0000293 1 F.c.03. LP.Seq 


F 


M00042S60B:C07 


CH16C0P 


989 


7620 


RTA00002935F.m. IS. LP.Seq 


F 


M00055240A:A08


CH17C0HLV 


990 


23922 


RTA00002935F.m.20.1.P.Seq


F 


M00055244B:F07 


CH17C0HLV 


991 


43864 


RTA00002931F.b.05. LP.Seq 


F 


M00042794A:F01 


CH16C0P 


992 


34478 


RTA00002929F.B. 13. LP.Seq 


F 


M00040367A:C08 


CH14EDT 


993 


6861 


RTA00002933F.C. 17. LP.Seq 


F 


M00043221D:C12 


CH19C0P 


994 


13971 


RTA00002933F.b.01.1.P.Seq


F 


M00043099A:H04 


CH19C0P 


995 


13971 


RTA00002933F.a.24. LP.Seq 


F 


M00043099A:H04 


CH19C0P 


996 


13244 


RTA00OO2927F.e.0S. LP.Seq 


F 


M00039537A:F0S 


CH12EDT 


997 


7455 


RTA00002935F.d. 1 L LP.Seq 


F 


MOOO545:SB:E05 


CH17C0HLV 


998 


18915 


RT A00O02929F.b.2 L LP.Seq 


F 


MO00402OlA:H0l 


CH14EDT 


999 


4023 


RTA00002935F.h.03.1.P.Seq


F 


M000547S6C:D0S 


CH17C0HLV 


1000 


10785 


RTA00002933F.a.11.1.P.Seq


F 


M00043074C:D07


CH19C0P 







SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY 


1001 


14851 


RTA0000292SF.i.O7. 1 .P.Seq 


F 


M000402S7CF10 


CH13EDT 


1002 


109382 


RTA00002907F.k. 13.2.P.Seq 


F 


M00022224A:G07 


CH03MAH 


1003 


23878 


RTA00002933F.b.03.1. P.Seq 


F 


M00043101D:G11


CH19C0P 


1004 


27516 


RTA00002927F.f. 17. l.P.Seq 


F 


M0003959SA:E04 


CH12EDT 


1005 


9652 


RTA00002931F.c.04.1.P.Seq


F 


M00042863D:F09 


CH16C0P 


1006 


24729 


RTA00002931F.a. 12.1. P.Seq 


F 


M00042462B:C02 


CH16C0P 


1007 


186041 


RTA00002912F.b.24.1.P.Seq


F 


M00027244C:B06


CH04MAL 


1008 


12282 


RTA00002935F.i. 1 8. 1 .P.Seq 


F 


M0005490SCA01 


CH17C0HLV 


1009 


10704 


RTA00002930F.f. 12. 1 .P.Seq 


F 


M00055757A:B01


CH15C0N 


1010 


3397 


RTA00002930F.h.08. 1 .P.Seq 


F 


M00055975B:F09 


CH15C0N 


1011 


35256 


RTA00002886F.m. 16.1. P.Seq 


F 


M00001374A:B02 


CH01COH 


1012 


1448 


RTA00002900F.g.05.1.P.Seq


F 


M00005002A:C03 


CH02COH 


1013 


1259 


RTA00002922F.n.08.1. P.Seq 


F 


M00039131C:B09


CH09LNL 


1014 


16903 


RTA00002935F.a.01.1. P.Seq 


F 


M00042352B:A04 


CH17C0HLV 


1015 


7884 


RTA00002922F.a. 13. l.P.Seq 


F 


M00038390B:F02


1175


10762 


RTA00002917F.o.08.1.P.Seq


F 


M00032793A:G06


CH08LNH 


1176 


23170 


RTA00002887F.f.15.1.P.Seq


F 


M00001396B:B01 


CH01COH


1177 


8487 


RT A00002887F. f. 1 6. 1 .P.Seq 


F 


M00001396B:B12 


CH01COH


1178 


185798 


RTA00002911F.k.06.1.P.Seq


F 


M00027050A:B02


CH04MAL


1179 


8976 


RTA00002896F.h.03.1.P.Seq


F 


M00004L6!B:G07 


CH01COH


1180 


12159 


RTA00002930F.c.01.1.P.Seq


F 


M00042S9^C:A11 


CH15C0N 


1181


7788 


RTA00002932F.b. 13. l.P.Seq 


F 


M000430TCD08 


CH18C0N 


1182 


43336 


RT A000029 1 7F.d.09. 1 .P.Seq 


F 


M0O0326SSGAO3 


CH08LNH


1183 


10313 


RTA00002902F.k. 19. l.P.Seq 


F 


M00006740B:A09 


CH02COH 


1184 


4588


RTA00002891F.o.11.1.P.Seq


F 


M000037S2A:B02 


CH01COH


1185 


18090 


RTA00002925F.il 7. 1 .P.Seq 


F 


M000399S1D:B01 


CH09LNL 


1186 


185994 


RTA00002911F.p.07.1.P.Seq


F 


M00027177B:D04 


CH04MAL


1187 


166276 


RT A00002908F.h.03. 1 .P.Seq 


F 


M0002243SCH09 


CH03MAH 


1188


15984 


RTA00002932F.a. 10. 1 .P.Seq 


F 


M00042621C:C04 


CH18CON


1189 


13242 


RTA000028S9F.L 1 1 . 1 .P.Seq 


F 


M00001550D:B11 


CH01COH


1190 


6840 


RTA00002935F.i.06. 1 .P.Seq 


F 


M00054Sc6B:C08 


CH17C0HLV 


1191 


17265 


RTA00002935F.1.04. 1 .P.Seq 


F 


M00055lOSB:A02 


CH17COHLV


1192 


12542 


RTA00002933F.b. 17. l.P.Seq 


F 


M00043152C:B10


CH19COP


1193 


1568 


RTA00002928F.d. 10. 1 .P.Seq 


F 


M00040r4D:G06 


CH13EDT 


1194 


8721 


RTA00002901F.k.23. 1 .P.Seq 


F 


M00005606D:B12 


CH02COH 


1195 


13519 


RTA00002898F.j.19.1.P.Seq


F 


M000043oSA:Bll 


CH01COH


1196 


4471 


RTA00002890F.d.14.1.P.Seq


F 


M00001600B:G01 


CH01COH


1197 


11357 


RTA00002931F.c.06.1.P.Seq


F 


M00042S"3D:F05 


CH16COP


1198 


11804 


RTA00002935F.m.17.1.P.Seq


F 


M00055:39D:Fn 


CH17COHLV


1199 


6999 


RTA00002896F.c.21.1.P.Seq


F 


M00004146A:C11 


CH01COH


1200 


4408 


RTA00002897F.a.02.1.P.Seq


F 


M0000420"C:A04 


CH01COH



WO 01/02568 



PCT/US00/18374 



SEQ
ID


CLUSTER 


SEQ NAME


ORIENTATION 


CLONE ID 


LIBRARY 


1201 


4618 


RTA00002935F.j.2 1. l.P.Seq 


F 


M00055004C:H05


CH17COHLV 


1202 


185841 


RTA00002912F.o.08.1.P.Seq


F 


M00027600B:C07 


CH04MAL 


1203 


1278 


t\ *t" . rinAAi c\ i otr i i \ 1 D Can 
RTAUUUU-y lir.J. 1 j.i.r.jeq _ 


F 


M00027459C:B10


CH04MAL 


1204 


19677 


RTA00002929F.h. 1 6. 1 .P.Seq 


F 


M000403S-iB:E04 


CH14EDT 


1205 


17539 


RTA00002909F.c.06.1.P.Seq


F 


M00022563B:C0S 


CH03MAH 


1206 
1207 


11390 
10735 


RTA00002892F.k.03.1.P.Seq
RTAuUUU-iviDr.L.Ul. l.r.oeq 


F 
F 


M00003830B:C06 
M00039822A:H02 


CH01COH
CH09LNL 


1208 
1209 
1210 
1211 


3239 
181718 
6957 
23673 


RTA00002887F.i.07.1.P.Seq
RTA00002905F.n.20. l.P.Seq 
RTA000029 17F.n. 18. 1. P.Seq 


F 
F 
F 
F 


M00001406D:F06 
M00021663D:A03 
M00032787D:C05 
M00042557D:B06 


CH01COH
CH03MAH 


CH08LNH 
CH15CON 


1212 
1213 
1214 
1215 


11405 
10256
25563 
2669 


RTA00002918F.b.13.1.P.Seq
RTA00002888F.d. 16. l.P.Seq 
R 1 AUUUU.-SV lr .P.ZJ. t.r.ocq 

RTA00002886F.l.03.1.P.Seq


F 
F 
F 
F 


M00032831A:E09 
M00001449B:H10 
M00001675B:D06 
M00001368A:B07


CH08LNH 
CH01COH
CH01COH
CH01COH


1216 
1217 


185877 
1314 


RTA00002911F.b.11.2.P.Seq
RTAuUUUiyjUr.c.Uo. i.r.oeq 


F 
F 
F 


M00023389A:G04 
M000549tlD:E06 
M00022509A:H02 


CH04MAL 
CH15CON
CH03MAH 


1218 
1219 


25843 
1794 


RTA00002908F.o.05.1.P.Seq
RTA00002924F.js.06. 1. P.Seq 


F 


M00039560C:G06 


CH09LNL


1220 


22038 


RTA00002896F.a.20.1.P.Seq


F 
F 


M00004136C:B12
M0003947SC:B02 


CH01COH
CH09LNL 


1221 
1222 


6011 
41087 


RTA00002924F.f. 12. l.P.Seq 

n*r \ aaaaooa l c ^ aa 1 P ^»»n 
R 1 AUUtlU-iVU L r .O.UO. l .r.oeq 


F 


M00005675D:D09 


CH02COH 


1223
1224 
1225 


18534 
1444 
1078 


RTA00002908F.o.16.1.P.Seq
RTA00002922F.a. 12. l.P.Seq 
RTA00002893F.p.24.1.P.Seq


F 
F 
F 


M00022512B:A09 
M00038389D:D10
M00003972C:F07 


CH03MAH 
CH09LNL 
CH01COH


1226 
1227 
1228 
1229 
1230 


8632 
105042 
6878 
23639 
19635 


RTA00002896F.h.09. l.P.Seq 
RT A000029 17F.0. 1 LI -P.Seq 
RTA00002912F.L1 LI. P.Seq 

nT * r>rtAniO i TIT ; A^ t P ^.i*n 

RTAuUUUIV Lir.i.UJ. i.r.oeq 

r»T" \ AAAA1QOAC a l> 1 P 


F 
F 
F 
F 
F 


M00004163B:C03 
M00032795C:A03 
M00027513D:F06 
M00027381B:B04 
M00004144D:B02 


CH01COH

/*"T tact \ru 

CH08LNH
CH04MAL 
CH04MAL 
CH01COH


1231 
1232 
1233 
1234 


7217 
4930 
16945 
24790 


t>t \ aaaaiq^AP A 1 Q P ^r*n 

RTAUUUL)_yjUrg.U4. i.r .ocq 

dt \ nnr^n'^ i P r» n ! t P S^n 
K 1 AUUUU^V- ir,u.Ui.i.r.oc4 

dt \ AAAA^QOfTF * 1 1 IP 


F 
F 
F 
F 
F 


M00040094B:COS 
M00055810C:D03 
M00038290A:D12 
M00001606D:D06 
M00042586A:B01


CH09LNL 
CH15CON 
CH09LNL 
CH01COH
CHI SCON 


1235 
1236 
1237 


22721 
14861 
2452 


RTA00002932F.a.07. 1 .P.Seq 
RT A0000290 1F.2.18.1 .P.Seq 

r»*T \ nAAA"»0'> l C h fP t P ^**n 


F 
F 


M00005505B:E01 
M00033302B:F10 


CH02COH
CH09LNL 


1238 


19269 


d*t \ aaaa"»*3Q7P n 1 1 1 P *\i»n 
K 1 AUUUU_i>o /r.p. 1 1, i .r.ocq 


F 


M00001430B:C01 


CH01COH


1239 


16029 


RTAOOUU-VjQr.j.l /.I.r.oeq 


F 


M00056244A:B06 


CH15CON


1240 


3038 


r»-r v AAnmnnC _ A/t 1 p Can 


F 


M00039121D:E07


CH09LNL 


1241 


2933 


o t" \ aaaatgoic o in 1 P ^PH 
R 1 AUUUU_y--r.2.-U. i. r.oeq 


F 


M00039056B:G01 


CH09LNL 


1242 




dt Annno^soi F i n 3 1 P Sea 


F 


M00003761B:B02


CH01COH


1243 


15524 


RTA00002930F.fl.09. l.P.Seq 


F 
F 


M00055S18B:D01 
M00054725C:D09


CH15CON
CH17COHLV


1244 
1245 


21550 
17567 


RT A00002935F.f.2 1 . l.P.Seq 
RTA000029!SF.p.iLLP.Seq 


F 


M00033006A:F10 


CH08LNH 


1246 


20293 


RTA00002888F.j.20.1.P.Seq


F 


M00001477D:G09


CH01COH


1247 


9520 


RTA00002927F.h.15.1.P.Seq


F 


M00039642C:FOS 


CH12EDT 


1248 


2700 


RTA00002889F.c.21.1.P.Seq


F 


M00001539C:F12 


CH01COH


1249 


25S91 


RTA00002909F.p.23.1.P.Seq


F 


M00022740OHU 


CH03MAH 


1250 


4298 


RTA00002908F.c.22.1.P.Seq


F 


M00022383C:A12


CH03MAH 






SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID


LIBRARY 


1251 


20412 


RTA00002909F.e.02. l.P.Seq 


F 


M00022590B:E05 


CH03MAH 


1252 


29413 


RTA00002906F.m.23. l.P.Seq 


F 


M00022069D:C12 


CH03MAH 


1253 


12315 


RTA00002907F.l.24.2.P.Seq


F 


M00022240D:B11 


CH03MAH 


1254 


4930 


RTA00002930F.g.04.2.P.Seq


F 


M00055810C:D03


CH15CON 


1255


12018 


RTA00002924F.C. 16. l.P.Seq 


F 


M0003943SB:D08 


CH09LNL 


1256


10501 


RTA00002930F.h.23.1.P.Seq


F 


M00056024B:F09 


CH15CON 


1257


11314 


RTA00002935F.k.03. 1 -P.Seq 


F 


M0005502JA:E11 


CH17COHLV


1258 


6426 


RTA00002927F.b.15.1.P.Seq


F 


M00039377B:H09 


CH12EDT 


1259 


2205 


RT A00002904F.C.03. 1 .P.Seq 


F 


M00O07195CEH 


CH02COH 


1260 


6991 


RTA00002922F.j. 12. l.P.Seq 


F 


M000390S1B:C04 


CH09LNL 


1261 


11928 


RTA00002906F.h.05.1.P.Seq


F 


M000219"iC:Blt 


CH03MAH 


1262


28226 


RTA00002907F.n.20. l.P.Seq 


F 


M00022262B:B06 


CH03MAH 


1263 


16059


RTA00002935F.j. 13. l.P.Seq 


F 


M000549~SC:F01 


CH17COHLV 


1264 


2252 


RTA00002886F.k.24. 1 .P.Seq 


F 


M0000136SA:A08 


CH01COH


1265 


4059 


RTA00002935F.f. 19. l.P.Seq 


F 


M000547UB:G10 


CH17COHLV


1266 


21795 


RTA00002901F.b. 16. 1 .P.Seq 


F 


M0000544:a:BL0 


CH02COH 


1267 


15049 


RTA00002935F.j. 10. l.P.Seq 


F 


M000549~"5B:E12 


CH17COHLV


1268 


5565 


RTA00002930F.c.02.1.P.Seq


F 


M0004290SA:F09 


CH15CON


1269 


20493 


RTA00002933F.a.14.1.P.Seq


F 


M000430"C:D12 


CH19COP


1270 


20257 


RT A00002934F.a. 14. 1 .P.Seq 


F 


M00043495C:H05


CH20COHLV 


1271 


16392 


RTA00002899F.a.05. l.P.Seq 


F 


M0000441SB:All 


CH01COH


1272 


15797 


RTA00002930F.c.23.1.P.Seq


F 


M0005545cC:H06 


CH15CON


1273 


1811 


RTA00002891F.d.21.1.P.Seq


F 


M000016S-D:E04 


CH01COH


1274 


17503 


RTA00002935F.f.08. l.P.Seq 


F 


M00054675D:G03 


CH17COHLV


1275 


14639 


RTA00002905F.h.12.1.P.Seq


F 


M0000SOT5D:DOl 


CH03MAH 


1276 


9146 


RTA00002934F.a. 10. l.P.Seq 


F 


M00043465C:H11


CH20COHLV 


1277 


10689 


RTA00002930F.h.19.1.P.Seq


F 


M0OO56OC~B:C05 


CH15CON


1278


11596 


RTA00002890F.e.23. l.P.Seq 


F 


M0000160"A:E04 


CH01COH


1279 


23731 


RTA00002930F.g.18.1.P.Seq


F 


M00055S".*D;C02 


CH15CON


1280 


25429 


RTA00002930F.h.10.1.P.Seq


F 


M000559SCC:B04 


CH15CON


1281 


1610 


RTA00002931F.b.24.1.P.Seq


F 


M00042S5SC:Gil 


CH16COP


1282 


1176 


RTA00002935F.a. 10. l.P.Seq 


F 


M000424f*C:B06 


CH17COHLV


1283 


23578 


RTA00002930F.a.22.1.P.Seq


F 


M000425T C A:D09 


CH15CON


1284 


17238 


RT A00002932F.a. 18.1 .P.Seq 


F 


M0004297CC:B01 


CH18CON


1285 


1610 


RTA00002931F.c.01.1.P.Seq


F 


M00042S5SC:G11 


CH16COP


1286 


16366 


RT A00002932F.a.04. 1 .P.Seq 


F 


M000425S5A:HU 


CH18CON


1287 


19709 


RTA00002932F.a. 12. l.P.Seq 


F 


M0004295 1D:G12 


CH18CON


1288 


11027 


RTA00002932F.b.10.1.P.Seq


F 


M000430i:-3:E03 


CH18CON


1289 


23451 


RTA00002933F.a.09. l.P.Seq 


F 


. M000426L"3:EOl 


CH19COP 


1290 


23731 


RTA00002930F.g.18.2.P.Seq


F 


M00055S7?D:C02 


CH15CON 


1291 


47898 


RTA00002911F.k.07.1.P.Seq 


F 


M0002705:A:EIO 


CH04MAL 


1292 


32581 


RTA00002932F.b.02. l.P.Seq 


F 


M0004296cD:E03 


CH18CON


1293 


42 


RTA00002934F.a.11.1.P.Seq


F 


M0004347CA:CIO 


CH20COHLV 


1294 


1447 


RTA00002935F.d. 14. l.P.Seq 


F 


M0005453c3:B0l 


CH17COHLV 


1295 


10449 


RTA00002935F.o.20.1.P.Seq


F 


M0005539fD:Dll 


CH17COHLV 


1296 


35359 


RTA00002935F.h. 11. l.P.Seq 


F 


M00054SrD:Ali 


CH17COHLV 


1297 


19657 


RTA00002935F.I.20. l.P.Seq 


F 


M00055lccC:Dl0 


CH17COHLV


1298 


12659 


RTA00002930F.L21. l.P.Seq 


F 


M00056133A:E11


CH15CON 


1299 


9081 


RTA00002934F.a.22. l.P.Seq 


F 


M0004364CA:B01 


CH20COHLV 


1300 


17084 


RTA00002935F.a.14.1.P.Seq


F 


M000425:l3:H04 


CH17COHLV 








SEQ
ID


CLUSTER


SEQ NAME


ORIENTATION


CLONE ID


LIBRARY


1301 


11972 


RTA00002935F.b.10.1.P.Seq


F 


M00043336D:B03


CH17COHLV 


1302 


11077 


RTA00002935F.c.12.1.P.Seq


F 


M00043402B:G07 


/"* T T 1 ™f r"At TT t / 

CH17COHLV 


1303 


126414 


RTA00002885F.a.01.1.P.Seq


F 


M00042350A:A05


CH16COP


1304 


113291 


RTA00002935F.m. 15. 1. P.Seq 


F 


M00055232A:E08


CH17COHLV


1305 


13224 


RT A00002935 F. f. 1 5 . 1 .P.Seq 


F 


M00054693A:E11


CH17COHLV


1306 


14883 


RTA00002930F.k. 14. 1 .P.Seq 


F 


M00056320B:A03 


CH15CON


1307 


13363 


RTA00002935F.a.02. l.P.Seq 


F 


M00042352D:B03


CH17COHLV 


1308 


16869 


RTA00002889F.c.21.1.P.Seq


F 


M00001djjC:G1 1 


AtJA 1 /""AT T 

CH01COH 


1309 


16 


RTA00002935F.a.06. l.P.Seq 


F 


M00042449B:F05


CH17COHLV 


1310 


4359 


RTA00002903F.j.16.1.P.Seq


F 


M00006994C:F06


CH02COH


1311


20726 


f» t* • nnAA^A^n c~ _ nine 

RT A00002908F.a. 1 7. 1 .P.Seq 


F 


M00O22jo^C:DI)5 


CH03MAH


1312 


13713 


RTA00002924F. 1.09. l.P.Seq 


F 


% jf AAA^ O C f /~T\ \ 

MOOOjyoboC :C0 1 




1313 


29271 


RTA000029 j^F.d.20. 1 .P.Seq 


F 


M0005434SC:HUo 


CH17COHLV


1314 


6237 


*■» * a a a a a a ^ .r t~ a* i i i n c _ 

RTA000029.oF.t. 14. l.P.Seq 


F 


MOOOD4o8oA:r 1U 


CH17COHLV


1315 


3472 


nm . ^\A/"vAAAAA f" 1 A^ i r* O 

RTA00002922F.n.12.1.P.Seq


F 


M000j9 I j4D:F08 


atjaat xrr 


1316 


186798 


RTA00002911F.f.11.1.P.Seq


F 


M00026914C:H09


CH04MAL


1317 


13193 


RTA00002886F.1. 16. l.P.Seq 


F 


M00001jo9A:G06 


ATJA 1 AAff 

CH01COH


1318 


3554 


RTA00002919F.i.14.1.P.Seq


F 


M00033149B:E10


CH08LNH


1319 


19991 


RTA00002908F.h.11.1.P.Seq


F 


M00022446C:H06 


CH03MAH


1320 


173046 


RTA00002901 F.o. 19. 1 .P.Seq 


F 


M0000o/O^D:G10 


CH02COH 


1321 


21798 


. aaaaaa*> a r~ l i a l n c 

RTA00002932F.b.12.1.P.Seq


F 


\ f AAA 1 r\ t ^ T> CAA 

M00043016B:F09


CH18CON


1322 


11303 


t« t AAAAAfiAnf - "ii i r> f* 

RT A00002S98F.1. 11.1 .P.Seq 


F 


X f AAAA ,1 ''CA \ . C A 1 

M00004 j D 9 A : t U I 




1323 


4026 


RTA00002915F.m.02.2.P.Seq


F 


M00032494C:H08


pu aoi vrr_r 


1324 


94859 


RTA00002909F.. .23. l.P.Seq 


F 


M000226^6D:D07 


CH03MAH


1325 


12315 


RTA00002907F.m.01. l.P.Seq 


F 


M00022240D:B11


T T A ^ \ f \ T T 

CH03MAH


1326 


4822 


RTA00002909F.1. 16. 1 .P.Seq 


F 


% rA/\AsAA/*AA * % a^ 

M00022690A:A07


T T A -> \ f *. TT 


1327 


97129 


RTA00002909F.1. 13.1 .P.Seq 


F 


M000226;54A:E06 




1328 


15996 


RTA00002897F.I.09. 1 .P.Seq 


F 


M000042ft>l A:C04 


/-> T TA 1 AATT 


1329 


7209 


RT A000029 1 SF.c.0 1 . 1 .P.Seq 


F 


M000^2S.oD:G04 


/-^ TT A O f \ru 


1330 


111888 


RTA00002902F.h.08. l.P.Seq 


F 


M000006.- SC:C02 


CH02COH


1331 


15642 


RTA00002902F. 2.06. l.P.Seq 


F 


X .(AAAA^ C 1 C \ . \ f\H 

M0000oo4cA: AO / 




1332 


20016 


RT A000029 1 6F. t.OD . 1 .P.Seq 


F 




funs T \JTJ 


1333 


21603 


RTA00002902F.a.05. l.P.Seq 


F


M00Q00 / b * U : AU I 




1334 


156903 


RTA00002907F. 1.09.2. P.Seq 


F 


x fAAA'i a ^ Ai~\ ,nr\i 

M00022-OUB : BUD 


AITA1\ f \ TJ 


1335 


1425 


»r» ft aaaaaa i ^ i - l i A \ rt e* _ 

RTA00002916F.b.19.1.P.Seq


F 


M000j2d41C:GOj 




1336 


186061 


RTA00002911F.e.24.1.P.Seq


F 


» (AAAO ^AAA \ . T T A"7 

M00026900A:H07 


CH04MAL


1337 


20717 


RTA00002907F.o.19.1.P.Seq


F 


% *ta/\aaaa.*^ » ^^rt^ 

M000222 . jA:E0j 


/— • T TA'» \ f \ ri 

CH03MAH


1338 


12586 


RTA00002887F.a.09.1.P.Seq


F 


M00O01 j^^A:E07 


CH01COH 


1339 


19719 


RTA00002914F.h.23. l.P.Seq 


F 


M00028212D:C05


CH08LNH 


1340 


474 


RTA000029 1 7F.2. 15. l.P.Seq 


F 


MO0Oj2/2 . A:E04 


CH08LNH 


1341 


11907


RTA00002923F.O.07. l.P.Seq 


F 


X f AAA** A I /"* AA1 

M00Oj9^61C:CO7 


/~i itaai vrr 


1342 


6806 


RTA00002928F.d.02. 1 .P.Seq 


F 


% * AA A i A 1 A \ /"* A ^ 

M00040 1 09 A: GOo 


CH13EDT




I J 140 


K 1 AuUUUiov-r.i . lU._.r..jcq 


c 
r 


vinnnm*? i * a - fins 




1344 


16686 


RTA00002919F.f. 14. l.P.Seq 


F 


M00033072A:A09


CH08LNH 


1345 


6823 


RTA00002888F.a.04.1.P.Seq


F 


M00001433B:E02 


CH01COH 


1346 


43029 


RTA00002897F.d.03.1.P.Seq


F 


M000042:fD:EO3 


CH01COH 


1347 


14789 


RTA00002935F.k.U. l.P.Seq 


F 


M000550f5C:FOl 


CH17COHLV 


1348 


186061 


RTA00002911F.f.01.1.P.Seq


F 


M00026900A:H07


CH04MAL


1349 


12823 


RTA00002921F.2.24. l.P.Seq 


F 


M00033434D:F05 


CH09LNL


1350 


25844 


RTA00002908F.k.23.1.P.Seq


F 


M000224"4B:C0S 


CH03MAH 









SEQ 
ID 


CLUSTER


SEQ NAME


ORIENTATION 


CLONE ID 


LIBRARY


1351


47793 


K 1 AU l AJVJ-y JUr.a.UJ,-.r .ocq 


p 


MflOOSSSOQ A R09 


CH15CON 


1352 


7695 


K 1 AUUUU-oVjir.n. lo. — r.ocq 


P 




CH01COH 


1353


16997 


K 1 AU'JUUiV.-r.K- 15. l.r. icq 


C 

r 


Mfinn^o in - af p 


CH09LNL 


1354 


25441 


K 1 AUUUU-VUOr.l.Uo. 1 .r.ocq 


r 


lVl\/\JL/«i I/O I • u— 


CH03MAH 


1355




dx a nnnrP 007F n^fl 1 P On 
K 1 AUUUU-07 / r.o.-v. 1 .r.ocq 


p 


vf noon4 n q ^ n • ro7 


CH01COH


1356


5741 


K 1 AUUUU-oo /r .c. i v. 1 .r.ocq 


p 

r 


moooo nQon-FO" 7 


CH01COH


1357 


17264 


K 1 AUUUU^yUUr.a. i 0. 1 .r.ocq 


F 


M0000d9" If-fil 1 


CH02COH 


I T CO 

1358 


11766 


DX \ r»n0rt707<C f* 70. 1 P On 

K. 1 AlHJUU_V25r.r.-iu. i .r.ocq 


F 
r 


M000 iQS"" 1 f-O05 


CH09LNL 


1359 


13618 


ox \nnnn7QQ7C rt 1> 1 P On 
K 1 AUUUU_393r.0.lJ. l.r.oeq 


p 


MOOOO'i Qrt * D ■ F0 1 


CH01COH


1360


1 j90j 


K 1 AUUUU-Vijr.C. 10. L. r.ocq 


p 


M000 iQ n 04 AE09 


CH09LNL


1361


lUO/J 


ox a nnnrPQ7 7p h 77 1 p c^n 
K l / r.n.« j. i.r . ocq 


F 


M000196-i6 \ E06 


CH12EDT 


1362


1/412 


dx a nnrifnG77F u 1 1 1 p On 


F 


M000410 1 ^DD05 


CH18CON


1363


22 lis 


0 x a 00.0070 1 qc « 70 1 p On 
K 1 .-\UUUU»7 1 y r.u.-w. l .r.ocq 


F 

r 


MO0O330' ? SC• A0° 


CH08LNH 


1364


JOJO 


ox Afinn.n.7Q77F i ni 1 p On 


F 
* 


M00039°7^BE02 


CH09LNL 


1365


nciri 


px Annnn7 9QQp k id 1 p On 

K 1 nUUUU.37or.O. l.r.ocq^ 


F 


M0OOO4316AB03 


CH01COH


1366


<SU5U 


pXAnnno^anoF n oj. 1 p On 
K. 1 1 *\uuuu-7\jur.n.u4, 1 .r.ocq 


F 


M000053S' ; A C 11 


CH02COH 


1367


1 Q A •> 1 C 

1005 jo 


i\. 1 /avJUUU— 7— yr.c. 10.1 .r.ocq 


F 


M000403^9 AH05 


CH14EDT 


1368


Z542 / 


ox \ nnfin^o^sp n 1 P 
K 1 AUUVjU-y j jr.n._v. 1 .r.ocq 


F 

r 


M00055"""BC04 


CH17COHLV


1369


24098 


d x \ nnnn^on 1 P 1 1 A 1 P 

K 1 AUUUU-yui r.3. IU. 1 .r.ocq 


p 
r 


M0000S4 ^ ^ D • HO" 7 


CH01COH


1370 


123823 


ox a nnnn~> qacc k aq i p c* n 
K 1 AUUUU^yU5r.n.Uo. 1. r.ocq 


F 
r 


M0000S07 1 DH03 


CH03MAH


1371 


3644 


0 x a nnon7QA 1 p « rn 1 p 


p 

r 


M00OOS44^DD04 


CH01COH


1372 


277Sj 


ox a nnnmo 1 7P a \i 1 p 

K 1 AUUUU-iy L /r.J. 1 /. 1 .r.ocq 


p 
r 


M0003^666 A C0° 


CH08LNH 


1373


I0o2 


dt Afinnn^Q l HP K 0^ 1 P ^on 




MOOO^SO ? D-D09 


CH03MAH


1374


Tin A 


dx Annrin7Q07p ,» n7 1 p ^r»n 
K 1 auuuu.oq /r.c.u /. 1 .r. jcq 


F 


MOOOO 1 19 -C F04 


CH01COH


1375


o44i 


ox a nnnn j 7P h 1 p ^»»n 


P 


M000327" : -BEP 


CH08LNH


1376


15j5j 


ox AnnorpQ inp 1 1 1 P ^i=»n 
K l AUUUU-V 1 wr.e. 1 1. 1 .r.ocq 


F 


M0007^S^-iCG07 


CH03MAH


1377


0jl4 


PXAAnnn^Q77p K Ofi I p Q,»fi 


F 


M0003861SD-D08 


CH09LNL


1378


yj54y 


dx a orinn7onQp i ij. i P Q^n 
K i Auuuu_yuyr.j. 1*+. 1 .r.ocq 


p 


M00O? r> 66~C-H04 


CH03MAH 


1379


1 < IAiC 


ox a nnnn^onAP n n - ^ ( P 
K I .-\UUUU_vuor.p.uo. i .r .oc-j 


p 


MOOO^OS^B-rtfP 


CH03MAH 


1380


16572 


K 1 AUUUU-ooOr.N.UJ. 1. r.ocq 


p 
r 


M0000 1 ^6-1 A C09 


CH01COH


1381 


74821 


K 1 AUUUU-oyUr.p.-l.l.r.oeq 


P 
r 


\f 00O0 IM'A'AP 


CH01COH


i i on 

1382 


1 1315 


OX \ r\AriA7CQQC A IT 1 P Cj»n 

K 1 AUUUUj.ooyr.a. l_. 1 .r.ocq 


p 
r 


M00001 7^ C BB 10 


CH01COH


1383 


10859 


DX \ AnnA7 0Q / 1 c |Q I P C^n 

K 1 AUUUU_oy4r.L. lo. J .r .ocq 


p 
r 


vroooo'i q s r- n - r 06 


CH01COH


1384 


15o91 


ox a Annfi7G i au f ru i p 

K I AUUUU_y l4r.i.U-+. 1 .r.ocQ 


p 


MOOO^S 19 *R E07 


CH08LNH


1385


2 J I /2 


o x a nnnA"> QQAP K 19 1 P 
K l AUUuu^oyor.u. to. i. r.ocq 


F 


M 00004 14' B F08 


CH01COH


1386


T~> c 1 A 

22510 


ox \nnnn7QQAF i c\> i p ^»>n 
K l AUUUU-ooOr.i.uo. i .r.ocq 


F 


MOOOO 136 > A -CO 0 


CH01COH


1387


1 7 1 <?£ 

1/156 


ox \ nnnn^O"? i P i H9 i P 

K 1 AUUUU-y j4r.u.U0. i .r.ocq 


F 
r 


M 0004345 ^ B COS 


CH20COHLV 


1388


4593 


px Annnn^^OAP a 19 i p 
t\ i .-\uuuu_ovor.u. i o. i .r.ocq 


F 


M 00004^000- A04 


CH01COH


1389


1 1 "TO 

2 178 


K 1 AUUUU-yUl r.m.Uo. 1 .r.ocq 


F 
r 


MOOOOSfi^^DG! 1 


CH01COH


1390


1015 


K 1 AUUUU_yj jr.C. 1 1. L. r.ocq 


F 


\ 10004 "V? 1 " A-D05 


CH19COP


1391


20 /92 


ox a nnnrpori7F a i<? i P 
K 1 AUUUU-VU / r.a. lo. i .r.ocq 


F 


\r000^^10 : CD05 


CH03MAH


1392 


"27830 


RTA00002921F.c.07.1.P.Seq


F 


M0003334-:A:B06 


CH09LNL 


1393 


14648 


RTA0000:S9SF.j.li.l.P.Scq 


F 


M0000436:C:Gll 


CH01COH


1394 


I25S5 


RTA0000:S97F.i.20.I.P.Seq 


F 


M0000426^A:F11 


CH01COH


1395 


15825 


RTA00002916F.d.12.1.P.Seq


F 


M00032553A:A07 


CH08LNH


1396 


7043 


RTA00002900F.h.07.1.P.Seq


F 


M0000501-3:F02 


CH02COH 


1397 


29354 


RTA00002905F.c.13.1.P.Seq


F 


M000079S1C:F07 


CH03MAH


1398 


29703 


RTA00002907F.d.24. 1 .P.Seq 


F 


M0002214-C:E12 


CH03MAH


1399 


6S11 


RT AOO00Z9 1 3F.D.07. 1 .P.Seq 


F 


M0002772O:D04 


CH04MAL


1400 


12657 


RTA00002906F.b.20. 1 .P.Seq 


F 


M0002186cC:HOS 


CH03MAH









SEQ 
ID 


CLUSTER


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY


1401 


in^ ■** 


dt Annf)fP c P n F e 08 "* P Sea 


F 


M00039024D:E12 


CH09LNL 


1402 


24229 


dt Annnfno^oF h 04 1 P Sea 


F 


M00033329C:C02


CH08LNH


1403 


20664 


dt AnnnrpS9nF i 07 I P Sea 


F


M0000133SC:F05 


CH01COH


1404 


3656 


dt AnnnrnorPF f ifl l P Sea 


F 


M00006641B:F05 


CH02COH 


1405 


10998 


dt a nnnnoo ^ l F r 07 1 P Sea 


F 


M0004287SD:G06 


CH16COP 


1406 


1 150 


dt a nnnrn QO^F i 14 l P Sea 


F 


M00039081B:G07 


CH09LNL 


1407 


A C11 I 

45221 


DTAnnnn^QOOF h 06 1 P Sea 


F 


M00005013D:H05 


CH02COH 


1408 


i a cnc 


rt AnnnfpQOlF a 16 1 P Sea 


F 


M00005423C:A10 


CH02COH 


1409 


Oi 7C 

81/5 


RTAnnofpQ'MF F01 1 P Sea 


F 


M00039472B:E05 


CH09LNL 


1410


ol / J 


RT AOO00' ) 9 7 4F e.24. 1 .P. Sea 


F 


M00039472B:E05 


CH09LNL 


1411


1VJ / J 


RT AOOOO' , 903F n 02.1. P. Seq 


F 


M00007081B:C08


CH02COH 


1412 


lUoOO 


RTAOOOO°9 n 9F c 15.1. P. Seq 


F 


M00040219B:B07 


CH14EDT 


1413 


24100 


RTAnnno^89iF k 07. LP Sea 


F 


M00003763A:B02 


CH01COH


1 A 1 A 

1414 


153AJ 


RT AOOOO^SSSF c P I P Sea 


F 


M00001442C:G12


CH01COH


1415 


/* >1 yl T a 


RTAfinfW907F h 17 1 P Sea 


F 


M00022117C:A02


CH03MAH 


1416 




rt AnnofPO'iOF a 16 1 P Sea 


F 


M00042560C:G06


CH15CON 


1417 


1111*7 

1231 / 


oTAnnno^QOSF ° 13 1 P Sea 


F 


M00022430C:C06 


CH03MAH 


1418 


i i nico 
1 19oo 


R T A fifififP SQ0F i ^4 t P Sea 


F 


M00001625D:B04 


CH01COH


1419 


1 A 1 Q 1 

1413 1 


rt Annnn^QOSF n 09 1 P Sea 


F 


M00022499D:D08 


CH03MAH 


i Jin 
1420 


1 CIsQ 

1535y 


RTAnnnn^Q09F 1 0^ l p Sea 


F 


M00022677C:C01 


CH03MAH 


1421 


400/5 


RTADnno^916F h 03 1 P Sea 


F 


M000325S*iA:D06 


CH08LNH 


1422 


Ivl ooo 


rt Ahnno^OOlF k 17 1 P Sea 


F 


M00007019B:E01 


CH02COH 


1423 


130424 


rt Annnn^QOSF m ^ 1 P. Sea 


F 


M00021653A:B02 


CH03MAH 


1424 


l lyyo 


rt a nnnfPQO l F b ^4 1 P Sea 


F 


M00005445A:E07


CH02COH 


1425 


nyyo 


RT AfinflfPQOlF r 01 1 P Sea 


F 


M00005445A:E07 


CH02COH 


1426 


AHQ 1 

4784 


rt Annno^SQiF e ^0 L P Sea 


F 


M000039SSD:B01 


CH01COH


1427 


n i in 

y liu 


rt a nnnn^Q 1 4F h 10 1 P Sea 


F 


M00028210B:H03 


CH08LNH 


1428 


i i in< 


rt a nnfifp SQOF i M 1 P Sea 


F 


M00001632C:A10


CH01COH


1429 


inn i 

3yy i 


rt AfWlfPSOnF h 05 1 P Sea 


F 


M0000416:D:F02 


CH01COH


1430




RT AOOOO' , 908F b 06 1 P Seq 


F 


M0002236"D:G11 


CH03MAH 


1431


l«£o_J 


RT^OOO^IF h.OLl.P.Seq 


F 


M00033434D:F05 


CH09LNL 


1432


14/41V 


RTA00OO^906F o 05 1 P.Seq 


F 


M00021952B:G06 


CH03MAH 


1433


10 1 7 J. 


RT\0000^9l9F f. 1 3. 1. P.Seq 


F 


M00033071D:E08


CH08LNH 


1 A 1 A 

1434 


550U3 


RT A0n00^S97F 0^4 1 P.Sea 


F 


M00004296B:D03 


CH01COH


1 A "3 C 

1435 


in; 
2j25 


rt Annno^S94F e 07 I P Sea 


F 


M00003994A:B10


CH01COH


1436


looioi 


RTAftfWPQOSF 1 05 1 P Sea 


F 


M00022475D:C07 


CH03MAH


1437 


5 /I J 


RTAnnnn^o^OF a 09 1 P Sea 


F 


M00033324B:F04 


CH08LNH 


1438


3624 


r t a nnno^Q i OF a 06 I P Sea 


F 


M00022901A:C05 


CH03MAH 


i a in 

1439 


10j05 


RT Annno^QOQF 3 07 I P Sea 


F 


M00022530B:C04 


CH03MAH


t A A(\ 

1440 


77oo 


rt AnnoO^Q I0F k n L P Sea 


F 


M00022992B:GL2 


CH03MAH 


1 A A 1 

1441 


y<54/ 


RT -xnooo^QOSF o 07 I P.Sea 


F 


M00022516B:C05 


CH03MAH


1442


8583 


RTA00002887F.o.06.1.P.Seq


F 


M0000t426C:F06 


CH01COH


1443


24376 


RTA00002900F.b.07.1.P.Seq


F 


M000O4S36B:C02 


» IA1PALI 

CH02COH 


1444 


8743 


RTA00002907F.n. 19.1 .P.Seq 


F 


M00022262A:F06 


CH03MAH 


1445 


22251 


RTA00002926F.c.10.2.P.Seq


F 


M000400:9B:F06 


CH09LNL 


1446 


12337 


RTA00002928F.d.07.1.P.Seq


F 


M00040173D:A04


CH13EDT


1447 


13623 


RTA00002911F.d.08.2.P.Seq


F 


M00026S42B:AOl 


CH04MAL 


1448 


5521 


RTA00002887F.j.06.1.P.Seq


F 


M00001406B:H09 


CH01COH


1449 


2193 


RTA00002933F.a.13.1.P.Seq


F 


M0004307"B:Fll 


CH19COP


1450 


773 


RTA00002889F.j.02.1.P.Seq


F 


M00001551D:H09 


CH01COH






SEQ

ID 



CLUSTER 



SEQ NAME



ORIENTATION 



CLONE ID 



LIBRARY



1451 



142367 



RTA00002927F.h.11.1.P.Seq



M00039630D:B07
M00001537B:H10



CH12EDT 



1452



19284 



RTA00002889F.e.10.1.P.Seq



CH01COH 



1453 



24011 



RTA00002924F.c.17.1.P.Seq



M00039440C:G06
M00026910B:G06



CH09LNL 
CH04MAL 



1454 



5930 



RTA000029 UF.t.08. 1. P.Seq 



1455 



21581 



RTA00002902F.C.05. 1. P.Seq 



M00005822C:A04 
M00039826D:E04



CH02COH 



1456 



1457 



1458 



3662 



RTA00002925F.c.07.1.P.Seq



CH09LNL 



4873 



RTA00002930F.b.05.1. P.Seq 



M00042719A:G08 



CH15CON 



11214 



RTA00002896F.h.01.1.P.Seq



M00004161A:E08 



CH01COH



1459 



22888 



RTA00002892F.I.Q9. l.P.Seq 



M00003837C:D10
M00039932B:A07 



CH01COH



1460 



15490 



RTA00002925F.k.08. 1 .P.Seq 



CH09LNL 



112819 



RTA00002905F.o.l3.1.P.Seq 



M00021676C:G03
M00004179D:A12 



CH03MAH 



1462 



19683 



RTA0O0O2896F.1.02.1 .P.Seq 



CH01COH



15132 



RTA00002922F.n.20.1.P.Seq



MQ003913SB:G05 
M00023219B:H05 



CH09LNL 



1464 



25022 



RT A000029 1 4F. i .2 1 . 1 .P.Seq 



CH08LNH 



1465 



16303 



RTA00002888F.b. 12. 1 .P.Seq 



M0000143SA:E01 



CH01COH



16828 



RTA00002897F.b.04.1.P.Seq



M00004214A:E05 



CH01COH



1467 



14295 



RTA00002921F.a.18.1.P.Seq



M0Q033296CCH 
M00055725D:D09 



CH09LNL 



1468 



1979 



RTA00002930F.f.06.1.P.Seq



CH15CON



1469 



36248 



RTA00002888F.g.05.1.P.Seq



M0000146GC:E1Q 
M000400753:A05 



CH01COH



1470 



1471 



5676 



RTA00002926F.b.22.2.P.Seq



CH09LNL 



1239 



RTA0Q002337F.O.2 1 . 1 .P.Seq 



M0000142SB:C10 
M0003272SD:F01 



CH01COH



1472 



7937 



RTA00002917F.g.22.1.P.Seq



CH08LNH 



1473 



4483 
7796 



RTA00002911F.d.22.2.P.Seq



M000268563:G03 
M00039826B:F09 



CH04MAL 



1474 



RTA00002925F.c.05.1.P.Seq



CH09LNL 



1475 



17330 



25620 



RTA00002915F.a.03.1.P.Seq



M00023616C:D09 



CH08LNH 



RTA00002902F.f.09. 1 .P.Seq 



M00006631C:A04



CH02COH 



20601 



RT A00002923F.1.20. 1 .P.Seq 



M00039326A:G07 
M0003925SC:C01 



CH09LNL 



1478 



6205 



RTA00002923F.g.21.1.P.Seq



CH09LNL 



1479 



726 



RTA00002913F.b.16.1.P.Seq



M00027734D:C03 
M000224353:G12 



CH04MAL 



104999 



RTA00002908F.g.17.1.P.Seq



CH03MAH 



30321 



RTA00002919F.o.17.1.P.Seq



M00033264B:E06 
M000276SSC:C01 



CH08LNH



1482 



1483 



1484 



5373 



RTA00002913F.a.16.1.P.Seq



CH04MAL 



5944 



RTA00002905F.m.07.1.P.Seq



M00021649B:A02



CH03MAH 



5796 



RTA00002908F.i.21. l.P.Seq 



M0002245"A:G05 



CH03MAH 



3804 



RTA00002935F.m.24.1.P.Seq



M0005525--lA:H03 



CH17COHLV



1486



2728 



RTA000029 18F.a.22. l.P.Seq 



M0003282SA:A06 
M00055234A:H03 



CH08LNH 



1487



1488



3804 



RTAQ0002933F.n.01 . 1 .P.Seq 



CH17COHLV



3932 



RTA00QO29 1 SF.o. 1 9.2.P.Seq 



M0003 25 rC:E 10 



CH08LNH



16691 



RTA00002891F.o.03.1.P.Seq



MOOO037SCA:GOl 
M00005003D:C02



CH01COH



1490 



15430 



RTA00002900F.g.10.1.P.Seq



CH02COH 
CH09LNL 



1491 



1492 



1493 
1494 



5637 



RTA00002925F.b. 18. 1 .P.Seq 



M00039S20B:F06 



16633 



RTA00002897F.g. 15. 1 .P.Seq 



M0000424c3:HQ7 



21826 



RTA00002898F.g.06. l.P.Seq 



M0000434-:A:G11 
M0O033UcD:A03 



CH01COH
CH01COH



1495 



1496 



1497 



22193 



RTA000029 19F.L09. 1 -P.Seq 



CH08LNH



10720 



RTA00002898F.C. 14. l.P.Seq 



M0000432CC:E0" 



CH01COH



22491 



RTA00002923F.m.06. 1 .P.Seq 



M00040003A:G10 



CH09LNL 



10423 



RTA000029 15F.n.l3.2. P.Seq 



M0O0325O"D:GO8 



CH08LNH



4953 



RTA00002916F.h.11.1.P.Seq



M000325ScC:B04 



CH08LNH
CH04MAL 



1499 



185567 



RTA0Q0029 1 lF.p.OS. l.P.Seq 



M0002717S3:AU 



25605 



RTA00002924F.m.22. 1 .P.Seq 



M0003 l )7lC3:A0l 



CH09LNL 
SEQ
ID


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY 


1501 


29446 


RTA00002906F.m.24.1.P.Seq


F 


M00022070B:B04 


CH03MAH 


1502 


9668 


RTA00002908F.g.02.1.P.Seq


F 


M00022421A:F12 


CH03MAH 


1503 


29446 


RTA00002906F.n.01.1.P.Seq 


F 


M00022070B:B04 


CH03MAH 


1504 


7171 


RTA00002887F.m.22.1.P.Seq


F 


M00001421B:E07


CH01COH 



Table 3





Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 


1 


<NONE> 


<NONE> 


<NONE> 


<NONE>


<NONE> 


<NONE> 


2 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


3 


<NONE> 


<NONE> 


<NONE>




<NONE> 


<NONE> 


4 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


5 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


6 


<NONE> 








<NONE> 


<NONE> 


7 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


8


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 
<NONE>


<NONE> 
<NONE> 


9 
10 


<NONE> 
<NONE> 


<NONE> 
<NONE> 


<NONE> 
<NONE> 


<NONE> 
<NONE> 


<NONE> 


<NONE> 


11 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


12 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


13 


<NONE> 


<NONE> 


<NONE> 


<NONE> 
<NONE> 


<NONE> 
<NONE> 


<NONE> 
<NONE> 


14 

15 


<NONE> 
<NONE> 


<NONE> 
<NONE> 


<NONE> 
<NONE> 


<NONE> 


<NONE> 


<NONE> 


16 


<NONE> 


<NONE> 


<NONE> 


<NONE> 
<NONE> 


<NONE> 
<NONE> 


<NONE> 
<NONE> 


17 
18
19 
20 


<NONE> 
<NONE> 
<NONE> 
<NONE> 


<NONE> 
<NONE> 
<NONE> 
<NONE> 


<NONE> 
<NONE> 
<NONE> 
<NONE> 


<NONE> 
<NONE> 
<NONE> 


<NONE> 
<NONE> 
<NONE> 


<NONE> 
<NONE> 
<NONE> 


21 
22 

23 


<NONE> 
<NONE> 

<NONE> 


<NONE> 
<NONE> 

<NONE> 


<NONE> 
<NONE> 

<NONE> 


<NONE> 
<NONE> 

548562 


<NONE> 

<NONE> 
GENOME POLYPROTEIN
[CONTAINS: RNA
REPLICASE (EC 2.7.7.48); HELICASE;
COAT PROTEIN] -
apple stem grooving virus
(strain P-209)


<NONE> 
<NONE> 

9.: 


24 


<NONE> 


<NONE> 


<NONE> 


416959 


EXCISION REPAIR PROTEIN
ERCC-6 DNA repair helicase
ERCC6 - human >gi|182181
(L04791) excision repair protein 
[Homo sapiens] 


3.9 


25


<NONE> 


<NONE> 


<NONE> 


3327096 


(AB014541) KIAA0641 protein
[Homo sapiens] 


8.T 


26 


<NONE> 


<NONE> 


<NONE> 


861293 


(U28741) F35D2.1 gene 
product [Caenorhabditis 
elegans]


7.9


27 


<NONE> 


<NONE> 


<NONE> 


Tin - ? oi i 

j^y /oil 


f AJL031O32) eiucnsin-Uke 
protein 


5.5 


28


<NONE> 


<NONE> 


<NONE> 


2119692 


transforming growth factor-beta
type III receptor - chicken 
>gi|511S43(L0112H 
transforming growth factor-beta
type III receptor [Gallus gallus]
protein kinase PRK1 - human


5.1 
5.0 


29 


<NONE> 


<NONE> 


<NONE> 


2136028 







Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















30 


<NONE> 


<NONE> 


<NONE> 


2746912 


(AF040659) No definition line
found [Caenorhabditis elegans]


4.6 


31 


<NONE> 


<NONE> 


<NONE> 


2358287 


(AF010404) ALR [Homo
sapiens]


4.5 


32 


<NONE> 



<NONE> 


<NONE> 


3877816 


(Z96048) predicted using
Genefinder; cDNA EST
EMBL:D65516 comes from this
gene; cDNA EST yk191a5.5
comes from this gene
[Caenorhabditis elegans]


4.4 


33 


<NONE> 


<NONE> 


<NONE> 


4140268 


(Y14953) SRCR domain,
membrane form 2 


4.1 


34 


<NONE> 


<NONE> 


<NONE> 


1708663 


(U51183) transposase [Hydra
vulgaris] 


4.0 


35 


<NONE> 


<NONE> 


<NONE> 


1184100 


(U45958) pistil extensin-like 
protein [Nicotiana alata]


3.9 


36 


<NONE> 


<NONE> 


<NONE> 


121073 


GLUCOCORTICOID 
RECEPTOR (GR) 


3.9 


37 


<NONE> 


<NONE> 


<NONE> 


1718298 


(U75698) ORF 45; contains an 
extended acidic domain; EBV
BKRF4 homolog [Kaposi's
sarcoma-associated herpesvirus]
homolog, conserved in other 
gamma-herpesviruses 


2.6 


38 


<NONE> 


<NONE> 


<NONE> 


2352538 


(AF006564) alcohol 
dehydrogenase [Drosophila 
persimilis] persimilis]


1.4 


39 


<NONE> 


<NONE> 


<NONE> 


3192897 


(AF066071) SP85; PsB
[Dictyostelium discoideum]


1.4 


40 


<NONE> 


<NONE> 


<NONE> 


561645 


(L33421) This CDS feature is 
included to show the translation 
of the corresponding V_region. 
Presently translation qualifiers 
on V_region features are illegal 


1.0 


41 


<NONE> 


<NONE> 


<NONE> 


3878S57 


(Z.Bji_£d) predicted using 
Genefinder; cDNA EST
EMBL:D35016 comes from this
gene; cDNA EST
EMBL:D32583 comes from this
gene; cDNA EST
EMBL:D35258 comes from this
gene; cDNA EST
EMBL:C11471 comes from this
gene; cDNA EST EMBL:C...


1.0 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


42


<NONE>


<NONE>


<NONE>


1658571


(U75903) UGT1A7 [Rattus
norvegicus]


1.0


43


<NONE>


<NONE>


<NONE>


2338034


(AF005370) putative immediate
early protein [Alcelaphine
herpesvirus 1]


0.86


44 


<NONE> 


<NONE> 


<NONE> 


3043714


(AB011167) KIAA0595 protein
[Homo sapiens] 


0.42 


45 


<NONE> 


<NONE> 


<NONE> 


1723710 


rlVFUlHbllLAL WJIOI 
PROTEIN IN ASN2-PHB1 
INTERGENIC REGION 
>gi|2131678|pir||S64439
hypothetical protein YGR130c -
yeast (Saccharomyces
cerevisiae)

>gi|1323215|gnl|PID|e243523
(Z72915) ORF YGR130c
[Saccharomyces cerevisiae]


0.40 


46 


<NONE> 


<NONE> 


<NONE> 


1723710 


'HYFUlHhlKJAL KB 
PROTEIN IN ASN2-PHB1 
INTERGENIC REGION 
>gi|2131678|pir||S64439 
hypothetical protein YGR130c -
yeast (Saccharomyces
cerevisiae)

>gi|1323215|gnl|PID|e243523
(Z72915) ORF YGR130c
[Saccharomyces cerevisiae]


0.38 


47 


<NONE> 


<NONE> 


<NONE> 


2996117 


(AF046125) immediate early 2 
[Rat cytomegalovirus]


0.26 


48 


<NONE> 


<NONE> 


<NONE> 


4151809 


(AF102855) synaptic SAPAP-
interacting protein Synamon


0.024 


49 


<NONE> 


<NONE> 


<NONE> 


2773341 


(AF040954) putative protein 
phosphatase 1 nuclear targeting 
subunit [Rattus norvegicus] 
(D90914) hypothetical protein 


0.017 
3e-04 


50

51 


<NONE> 
<NONE> 


<NONE> 
<NONE> 


<NONE> 
<NONE> 


1653522 
3219965 


HYPOTHETICAL lUU.oKU 
TRP-ASP REPEATS 
CONTAINING PROTEIN 
C2C6.04C IN CHROMOSOME 
I 


3e-06 


52 


<NONE> 


<NONE> 


<NONE> 


4185567 


(AF115480) cAMP-dependent
Rap1 guanine-nucleotide
exchange factor [Mus musculus]


7e-07 



WO 01/02568 



PCT/US00/18374 





Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












53


<NONE>


<NONE>


<NONE>


1176527


HYPOTHETICAL 415 KD
PROTEIN C34E10.1 IN
CHROMOSOME III
>gi|500724 (U10402) C34E10.1
gene product [Caenorhabditis
elegans]


3e-20 


54 


X85444 


G.pallida repetitive 
DNA element 


5.0


2118936 


beta-globin - chimpanzee 
(fragment) 


8.6 


55 


X7296 L 


Synechococcus sp. 
cpeB, cpeA genes and 
ORF3




462569 


MICROTUBULE-
ASSOCIATED PROTEIN 1A
microtubule-associated protein
MAP1A - rat >gi|205538
norvegicus]


2.2 


56 


U94747 


Human WD repeat
protein HAN11
mRNA, complete cds


5.0 


3875538 


(Z67990) similar to cuticle 
collagen 


1.3 


57 


AF032108 


Homo sapiens 
integrin alpha-7 
mRNA, complete cds


5.0 


2147194 


collagen - Paralvinella grasslei


0.002 


58 


Z50798 


G.gallus mRNA for
P52


5.0 


3122885 


ASPARTYL-TRNA 
SYNTHETASE synthetase 
[Bacillus subtilis] 


3e-11


59 


AB002384 


Human mRNA for
KIAA0386 gene, 
complete cds 




2632098 


(Y15513) Prodos protein
[Drosophila melanogaster]


9e-12


60 


X14835


Thermofilum pendens
DNA for 16S and
23S ribosomal RNA,
tRNA-Met, and tRNA
Gly


4.9 


<NONE>


<NONE> 


<NONE> 


61 


U87149


Hordeum vulgare 
nucellin gene, 
complete cds 


4.9 


128578 


NONSTRUCTURAL 
PROTEIN NS-S spotted wilt 
virus (strain CPNH1) non- 
structural protein [Tomato 
spotted wilt virus] 


2.8 


62 


D87541 


Mus musculus gene 
for integrin alpha v 
subunit, promoter 
region 


4.9 


136956 


HYPOTHETICAL PROTEIN 
UL61 cytomegalovirus (strain 
AD169) cytomegalovirus]


0.038 


63 


U72520 


Mus musculus mena
protein (Mena)
mRNA, complete cds


4.9 


3413892 


(AB007934) KIAA0465 protein 
[Homo sapiens] 


6e-07
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 


64 



S79797 


glycosylation-
regulating gene [rats,
Sprague-Dawley,
streptozotocin
diabetic, heart,
mRNA, 5010 nt]


4.8 


<NONE> 


<NONE> 


<NONE> 


65 


AB011102


Homo sapiens mRNA
for KIAA0530
protein, partial cds


4.8 


138022 


RECEPTOR RECOGNIZING
PROTEIN gp38 - phage Ox2
>gi|15126 (X05675) gene 38
(AA 1-266); pid:g15126
[Bacteriophage Ox2]


3.6 


66 


AF100985 


Penaeus monodon 
phosphopyruvate 
hydratase mRNA, 
complete cds 


4.8 


500615 


(D16221) endochitinase [Oryza
sativa]


2.8 


67 


U31756 


Bacillus subtilis 
gamma- 
aminobutyrate 
permease cds 


4.8 


3880699 


(AL021471) similar to 
Eukaryotic aspartyl proteases 
[Caenorhabditis elegans] 
Eukaryotic aspartyl proteases 
[Caenorhabditis elegans] 


2.8 


68 


U25111


Pisum sativum
chloroplast 
processing enzyme 
mRNA, nuclear gene 
encoding chloroplast 
protein, complete cds. 


4.8 


1800145 


(U83658) FH1/FH2 protein 
homolog [Emericella nidulans]


1.6 


69 


U00454 


Mus musculus Cdx-2 
homeobox protein 
gene, complete cds. 


4.7 


<NONE> 


<NONE> 


<NONE> 


70 


M84166 


Hamster c-Ha-ras 
protein gene, 
complete cds. 


4.7 


1710606 


RENIN-BINDING PROTEIN
(RNBP) protein [Rattus
norvegicus]


0.88 


71 


AF087516 


Mus musculus major 
sperm fibrous sheath 
protein Pro- 
mAKAP82 gene, 
alternative splice 
exons 1' and 1" 


4.6 


<NONE> 


<NONE> 


<NONE> 


72 


X74160 


M.esculenta mRNA 
for granule-bound 
starch synthase 


4.6 


<NONE> 


<NONE> 


<NONE> 


73 


M97487 


Haloferax volcanii 
superoxide dismutase 
(sod2) gene, complete
cds. 


4.6 


2623307 


(AC002409) putative ubiquitin 
protease [Arabidopsis thaliana]


3.4 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


74 


M57889


Drosophila
melanogaster
suppressor of sable 
gene, complete cds. 


4.5 


<NONE> 


<NONE> 


<NONE> 


75 


D49708 


Rattus norvegicus 
mRNA for RNA 
binding protein 


4.5 


<NONE> 


<NONE> 


<NONE> 


76 


D31853


Yeast GTS1 gene for
glycin-threonin/serine
repeat protein, 
complete cds 


4.5 


2447195


(U42580) NETTF (7k), DETTS
(4x) [Paramecium bursaria
Chlorella virus 1] 


3.3 


77 


Z47036


Human partial cDNA 
sequence, clone 
bs6l3; 


2.9 


<NONE> 


<NONE> 


<NONE> 


78 


L19660


Rattus norvegicus 
gastric inhibitory 
peptide receptor 
mRNA, complete cds 


2.7 


2358279 


(AF007871) torsinA [Homo
sapiens] 


2e-07 


79 


X82841


A.thaliana Aco gene 


2.6 


483212 


immediate-early protein IE110 -
human herpesvirus 1 (strain
HFEM) (fragment)


8.4 


80 


X61931 


S.purpurascens famA 
and famB genes for 
FAS domain and acyl-
CoA-dehydrogenases,
respectively 


2.6 


2290534 


(U95031) sublingual gland 
mucin [Homo sapiens] 


n 47 


81 


U13680 


Human lactate 
dehydrogenase-C 
(LDH-C) mRNA, 
complete cds. 


2.5 


2887449


(AB007874) KIAA0414 [Homo
sapiens] 


3.1 


82 


AB007869


Homo sapiens
KIAA0409 mRNA,


? 4 


3130157 


(AB008859) pheromone 
receptor [Fugu rubripes]


5.4 


83 


X97479 


H.sapiens mas proto- 
oncogene, 5' region 


2.1 


<NONE> 


<NONE> 


<NONE> 


84 


X98374 


R.norvegicus mRNA 
for KIS protein 


1.9 


<NONE> 


<NONE> 


<NONE> 


85 


AE000710 


Aquifex aeolicus 
section 42 of 109 of 
the complete genome


1.9


<NONE> 


<NONE> 


<NONE> 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


86 


D30612


Homo sapiens mRNA
for repressor protein,
partial cds


1.9 




<NONE> 


<NONE> 


87 


Y14321 


Homo sapiens
PMP69 gene, exons
8, 9, 10 & 11


1.9 


<NONE> 


<NONE> 


<NONE> 


88


D90773 


E.coli genomic DNA, 
Kohara clone 
#262(30.3-30.5 min.) 


1.9 


1536816 


(D78305) DNA binding protein 
[Chlorella virus] 


7.9 


89 


AE000991 


Archaeoglobus 
fulgidus section 116
of 172 of the
complete genome 


1.9 




(X79095)
pyruvate,orthophosphate
dikinase [Flaveria trinervia]


2.7 


90 


U39476 


Rattus norvegicus 
p95 Vav (Vav) proto-
oncogene mRNA,
complete cds. 


1.9 




(AL023496) hypothetical
protein


1.6 


91 


U28838 


Human transcription 
factor TFIIIB 90 kDa
subunit 


1.9 


2495730 


HYPOTHETICAL PROLINE-
RICH PROTEIN KIAA0269
>gi|1665805|gnl|PID|d1014089
(D87459) Similar to Volvox
carteri extensin (S22697) 
[Homo sapiens] 


023 


92 


U20106 


Rattus norvegicus 
synaptotagmin VII 
mRNA, complete cds.


1.9 


478380 


UL47h protein - Marek's disease 
virus 


0.23 


93 


AF071010 


Mouse mammary 
tumor virus putative 
integrase, env 
polyprotein, and 
superantigen mRNA,
complete cds 


1.9 


2781386


(AC004010) similar to Leucine- 
rich transmembrane proteins; 
44% similarity to U42767 
(PID:g1736918) [Homo
sapiens] 


4e-33 


94 


AF061881


Mesocricetus auratus 
c-fos proto-oncogene 
protein (c-fos) gene, 
complete cds 


1.8 


<NONE> 


<NONE> 


<NONE> 


95 


AE001397 


Plasmodium 
falciparum 
chromosome 2, 
section 34 of 73 of 
the complete 
sequence 


1.8 


<NONE> 


<NONE> 


<NONE> 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE




96


Horseshoe crab
mRNA for
coagulation factor B,


1.8 


<NONE> 


<NONE> 


<NONE> 


97 



M29154 


P.falciparum 
multidrug resistance 
(MDR) gene,
complete cds. 


1.8 


<NONE> 


<NONE> 


<NONE> 


98 


L16532 


Rattus norvegicus 
(clone pCNPII) 2',3'-
cyclic nucleotide 3'- 
phosphodiesterase 

complete cds. 


1.8


<NONE> 


<NONE> 


<NONE> 


99 


AE001434 


Plasmodium 
falciparum 
chromosome 2, 
section 71 of 73 of
the complete 
sequence 


1.8 


<NONE> 


<NONE> 


<NONE> 


100 


Z46785 


D.melanogaster gene 
for protamine 
(mst35Bb). 


1.8 


<NONE> 


<NONE> 


<NONE> 


101 


X69822 


P.sylvestris mRNA
for glutamine
synthetase


1.8 


219896 


(D90452) l-caldesmon I [Homo 
sapiens] 


9.7 


102 


U49055 


Rattus norvegicus 
CTD-binding SR-like 
protein rA8 mRNA, 
complete cds 


1.8 


2497252 


INSULIN-LIKE GROWTH
FACTOR BINDING PROTEIN 
4 (IGFBP-4) (IBP-4) (IGF- 
BINDING PROTEIN 4) factor- 
binding protein-4 - sheep 
(fragment) factor-binding 
protein-4, IGFBP-4 [sheep, 
liver, Peptide, 237 aa] [Ovis 
aries] 


2.5 


103 


L28101 


Homo sapiens 
kallistatin (PI4) gene, 
exons 1-4, complete 
cds 


1.8 


4204267 


(AC005223) 55585 
[Arabidopsis thaliana]


2.4 


104 


U66987 


Pandorina morum 
internal transcribed 
spacer 1, 5.8S 
ribosomal RNA gene, 
and internal 
transcribed spacer 2, 
complete sequence 


1.8 


2635909 


(Z99121) permease [Bacillus 
subtilis]


1.9 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


105 


X58033 


Human polymorphic
McnT site DNA
(D3S3 locus)


1.8 


2136878


keratin KAP5.5 - sheep
(fragment) >gi|313722


0.65 


106 


U15780 


Human p82 (ST5)
mRNA, alternatively 
spliced, complete cds 


1.8 


3638957 


(AC004877) sco-spondin-mucin-
like; similar to P98167 uncertain
[Homo sapiens]


0.64 


107 


AF038535 


Homo sapiens 
synaptotagmin VII
mRNA, partial cds 


1.8 


457927 


(U00690) calcium channel alpha 
1 subunit [Drosophila 
melanogaster]


0.51 


108 


AF052134 


Homo sapiens clone 
23585 mRNA 
sequence 


1.8 


232263 


HOMEOBOX PROTEIN HOX-
D1 (HOX-4.9)


0.28 


109 


X75208 


H.sapiens HEK2 
mRNA for protein 
tyrosine kinase 
receptor. 


1.8 


1730198 


GROWTH-ARREST-SPECIFIC
PROTEIN 1 gene product
[Homo sapiens]


0.22 


110


AB013896


Xenopus laevis
mRNA for SOX-D, 
complete cds 


1.8 


2494501 


TRANSCRIPTION FACTOR 
FKH-4 factor [Mus musculus] 


0.17 


111 


D16947


Human HepG2 3'
region cDNA, clone 
hmd6bl0 


1.8 


3413870 


(AB007923) KIAA0454 protein 
[Homo sapiens]


0.002 


112 


D13547


Mouse DNA, T early 
alpha (TEA) region 


1.8 


3393018 


(AL031174) hypothetical
protein


5e-08 


113 


M35498 


Woodchuck c-myc
protein gene, exon 1.


1.8 


3183405 


HYPOTHETICAL 11.3 KD
PROTEIN C2C6.07 IN
CHROMOSOME I
>gi|2370504|gnl|PID|e339194

pombe]

>gi|3451305|gnl|PID|e1316730
(AL031324) very hypothetical
protein [Schizosaccharomyces
pombe]


8e-10 


114 


M84166 


Hamster c-Ha-ras 
protein gene, 
complete cds. 


1.8 


3386622 


(AC004665) unknown protein 
[Arabidopsis thaliana]


2e-10 


115 


U33135 


Mychodea carnosa
18S ribosomal RNA 
gene, complete 
sequence 


1.8 


3334982 


(AC005306) R27216J [Homo
sapiens] 


3e-22 


116 


U84003


Homo sapiens 
putative tumor 
suppressor (BIN1)
gene, exons 7-12 


1.7 


<NONE> 


<NONE> 


<NONE> 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


117 


AE001121


Borrelia burgdorferi
(section 7 of 70) of
the complete genome


1.7 


<NONE> 


<NONE> 


<NONE> 


118 


AE001114 


Archaeoglobus 
fulgidus section 165
of 172 of the 
complete genome 


1.7 


<NONE> 


<NONE> 


<NONE> 


119 


U82064 


Angiostrongylus 
cantonensis adult- 
specific muscle 
protein-1 gene, partial 
cds 


1.7 


<NONE> 


<NONE> 


<NONE> 


120 


AF041836 


Buchnera aphidicola 
plasmid pLeu-Sg,
complete plasmid 
sequence 


1.7 


<NONE>


<NONE>


<NONE>


121 


M87479 


Lymnaea stagnalis 
FMRFamide gene, 
mature peptides.


1.7 


<NONE> 


<NONE>


<NONE>


122 


M55163 


Xenopus laevis 
fibroblast growth 
factor receptor 
mRNA, complete cds. 


1.7 


<NONE> 


<NONE> 


<NONE> 


123 


S57565 


histamine H2-
receptor [rats,
Genomic, 1928 nt]


1.7 


<NONE> 


<NONE> 


<NONE> 


124 


M27256 


Simian 

immunodeficiency 
virus (SIV) pol
region.


1.7 


<NONE> 


<NONE> 


<NONE> 




125


U31516


Human chromosome 
8 anonymous clone 
pBS8-165 


1.7 


<NONE> 


<NONE> 


<NONE> 


126 


X12671


Human gene for 

heterogeneous 

nuclear 

ribonucleoprotein 
(hnRNP) core protein 
A1


1.7 


<NONE> 


<NONE> 


<NONE> 


127 


AF009054 


Paeonia suffruticosa
ssp. spontanea
alcohol

dehydrogenase 1B
(Adh1B) gene, partial
cds 


1.7 


<NONE> 


<NONE>


<NONE> 








Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















128 


AF046917 


transketolase gene, 
exon 6 and partial cds 


1.7 


<NONE> 


<NONE> 


<NONE> 


129 


D89053 


Homo sapiens mRNA 

synthetase 3, 
complete cds 


1.7 


<NONE> 


<NONE> 


<NONE> 


130 


U57968 


Staphylothermus 

llldi 1ULU> SUJ laic luja- 

associated STABLE 
protease gene, 
complete cds. 


1.7 


<NONE> 


<NONE> 


<NONE> 


131 




Bovine herpesvirus 1

(clone p95) UL24
homologue gene,
complete cds.


1.7


<NONE>


<NONE>


<NONE>




132 


X04980 


Drosophila simulans 
retrotransposon 297 
5'-LTR and flanks 
(pWK1020) 


1.7 


<NONE> 


<NONE> 


<NONE> 


133 


AE001114


Archaeoglobus
fulgidus section 165
of 172 of the 
complete genome 


1.7 


<NONE> 


<NONE> 


<NONE> 


134 


X04434 


Human mRNA for
insulin-like growth
factor I receptor


1.7 


<NONE> 


<NONE> 


<NONE> 


135 


U07890 


Mus musculus
C57BL/6J epidermal 
surface antigen 
(mesa) mRNA, 
complete cds. 


1.7 


<NONE> 


<NONE> 


<NONE> 


136 


D26163 


Human tyrosinase 
gene, 5-flanking 
region cell-specific 
transcription) 


1.7 


<NONE> 


<NONE> 


<NONE> 


137 


AF093818 


Panorpa nipponensis 
NADH 

dehydrogenase 
subunit 5 gene, 
mitochondrial gene 
encoding 
mitochondrial 
protein, partial cds 


1.7 


<NONE> 


<NONE> 


<NONE> 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 









138


D50560


Xenopus laevis
mRNA for
cytochrome P-450, 
complete cds 


1.7 


<NONE> 


<NONE>


<NONE> 


139 


AF083488 


Mus musculus
phospholipase D1
(PLD1) gene, exons 
18 and 19, complete 
sequence 


1.7 


<NONE> 


<NONE> 


<NONE> 


140 




Mus musculus 
Pontin52 mRNA, 


1.7 


<NONE>


<NONE> 


<NONE> 


141 


M73749 


Streptococcus 
salivarius 

thermophilus beta-D-
galactosidase (lacZ) gene,
complete cds. > ::
gb|M63636|STOJLAC
ZZ Streptococcus
thermophilus beta-D-
galactosidase (lacZ)
gene, complete cds.


1.7 


<NONE> 


<NONE> 


<NONE> 


142 


AE001114


Archaeoglobus
fulgidus section 165
of 172 of the 
complete genome 


1.7 


2183023 


(U84971) unknown [Homo 
sapiens] 


9.2 


143 


L01983 


Human type IV 
sodium channel alpha 
polypeptide 


1.7 


130504 


GENOME POLYPROTEIN
[CONTAINS: N-TERMINAL
PROTEIN (P1); HELPER
COMPONENT PROTEINASE 
INCLUSION PROTEIN (CI); 6 
KD PROTEIN 2 (6K2); 
GENOME-LINKED PROTEIN 
(VPG); NUCLEAR ... virus 
(strain D) 


9.2 


144 


L19731 


Plecotus rafinesquii 
mitochondrial 
cytochrome b gene, 5'
end. 


1.7 


3327096 


(AB014541) KIAA0641 protein 
[Homo sapiens] 


9.1 


145 


AE001114


Archaeoglobus 
fulgidus section 165 
of 172 of the 
complete genome 


1.7 


2183023 


(U84971) unknown [Homo 
sapiens] 


8.8 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 


146 


L27218 


Bos taurus serum
amine oxidase 
mRNA, complete cds. 
> oxidase=amiloride- 
binding protein 
homolog [cattle, liver, 
mRNA, 2664 nt]


1.7 


1174459 


SIGNAL TRANSDUCER AND 
ACTIVATOR OF 
TRANSCRIPTION 6 (IL-4 
STAT) >gi|559855 (U16031) IL-
4 Stat [Homo sapiens]


7.1 


147 


Z49868 


Caenorhabditis 
elegans cosmid 
W07E11, complete
sequence
[Caenorhabditis
elegans]


1.7 


4204263 


(AC005223)
[Arabidopsis thaliana] 


6.7 


148 


AL022271 


Caenorhabditis 
elegans cosmid 
F32F2, complete 
sequence 
[Caenorhabditis 
elegans]


1.7 


2497969 


PERIPLASMIC NITRATE
REDUCTASE PRECURSOR
>gi|1086107|pir||S56163 nitrate
reductase large chain precursor,
periplasmic - Thiosphaera
pantotropha >gi|600093
(Z36773) periplasmic nitrate
reductase large subunit
[Paracoccus denitrificans]


6.7 


149 


U43844 


Mus musculus cyclin
D3 gene, complete 
cds 


1.7 


3861490 


(AF062037) capsid protein 
precursor [Thosea asigna virus]


5.1 


150 


Z25464 


S.cerevisiae UNF1, 
LTV1, MRP8, CYB3
and TGL1 genes, 
complete CDS's 


1.7 


1255404 


(U53151) weak similarity to
cytochrome b [Caenorhabditis
elegans]


4.1 


151 


U77846 


Human elastin gene,
partial cds and partial 
3'UTR 


1.7 


3355682 


(AL031124) putative secreted
lyase


4.0 


152 


X62880 


S.scrofa mRNA for 
calcium release 
channel (CRC) 


1.7 


3327080 


(AB014533) KIAA0633 protein
[Homo sapiens] 


4.0 


153 


Y00067 


Human gene for 
neurofilament subunit 
M (NF-M) 


1.7 


479829


heterogeneous ribonuclear 
particle protein homolog -
Caenorhabditis elegans 
similarity to RNA recognition 
motifs [Caenorhabditis elegans] 


3.9 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


154 


X68393 


D.melanogaster gene 
for Beta-tubulin,
exons 1 and 2 


1.7 


2342682 


(AC000106) Contains similarity
to Rattus AMP-activated protein
kinase (gb|X95577).
[Arabidopsis thaliana]


3.8 


155 


AB012284


Shuttle vector
pAUR123 gene for
Aur1-C, complete cds


1.7 


417704 


POL POLYPROTEIN 
(ORF1A/1B) [CONTAINS:
RNA-DIRECTED RNA 
POLYMERASE ; HELICASE; 
PROTEASE ] 


3.8 


156 


M96633 


Rattus norvegicus 
mitochondrial 
intermediate 
peptidase (MIP) 
mRNA, complete cds.


1.7 


2314209 


(AE000613) H. pylori predicted 
coding region HP1054


3.1 


157 


U49055 


Rattus norvegicus
CTD-binding SR-like
protein rA8 mRNA, 
complete cds 


1.7 


2497252 


INSULIN-LIKE GROWTH
FACTOR BINDING PROTEIN 
4 (IGFBP-4) (IBP-4) (IGF- 
BINDING PROTEIN 4) factor- 
binding protein-4 - sheep 
(fragment) factor-binding 
protein-4, IGFBP-4 [sheep, 
liver, Peptide, 237 aa] [Ovis 
aries] 


3.0 


158 


Y15907


Mus musculus mRNA
for myc-intron-
binding protein-1


1.7 


912776 


iduronate-2-sulfatase, IDS {EC
3.1.6.13} Peptide Mutant,
aa]


3.0 


159 


U67600 


Methanococcus 
jannaschii section 142 
of 150 of the 
complete genome 


1.7 


2982355 


(AF052252) fork head domain 
protein FKD9 [Danio rerio]


3.0 


160 


AF013759


Homo sapiens
calumenin (Calu)
mRNA, complete cds 


1.7 


2982355 


(AF052252) fork head domain
protein FKD9 [Danio rerio]


2.9 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















161 


AF062915 


Arabidopsis thaliana 
putative transcription 
factor (MYB90) 
mRNA. complete cds 


1.7 


3878065 


Human mRNA product
KIAA0077 (TR:Q14997);
cDNA EST yk243h8.5 comes
from this gene; cDNA EST
yk243h8.3 comes from this
gene; cDNA EST yk359h4.5
comes from this gene
[Caenorhabditis elegans]
>gi|3880318|gnl|PID|e1349839
(Z81133) Similarity to Human
mRNA product KIAA0077
(TR:Q14997); cDNA EST
yk243h8.5 comes from this 
gene; cDNA EST yk243h8.3 
comes from this gene; cDNA 
EST yk359h4.5 comes from this 
gene 


2.3 


162 


X87526 


H.sapiens genomic 
DNA (chromosome 
3; clone NL3003R) 


1.7 


3638957 


(AC004877) sco-spondin-mucin- 
like; similar to P98167 uncertain 
[Homo sapiensl 


2.3 


163 


AC005573 


Homo sapiens 
chromosome 5, PAC 
clone 202ei3 


1.7 


2465540 


(AF005632) phosphodiesterase 
I/nucleotide pyrophosphatase 
beta [Homo sapiens] 


1.8 


164 


D83402 


Homo sapiens gene 
for prostacyclin 
synthase, exon 10 and 
complete cds 


1.7 


627608 


steroid hormone receptor TR3 - 
human sapiens] 


1.7 


165 


AF053700 


Homo sapiens deltex 
(Dx) mRNA, 
complete cds 


1.7 


2662089 


(AB007864) KIAA0404 [Homo 
sapiens] 


1.7 


166 


AF043225 


Mus musculus 6- 
pyruvoyl- 
tetrahydropterin 
synthase (Pts) 
mRNA, complete cds


1.7 


2352538 


(AF006564) alcohol 
dehydrogenase [Drosophila 
persimilis] persimilis] 


1.4 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















167 


U52917 


Thermus aquaticus
thermophilus NADH
dehydrogenase I
subunits NQO7,
NQO6, NQO5,
NQO4, NQO2,
NQO1, NQO3,
NQO8, NQO9,
NQO10, NQO11,
NQO12, NQO13, and

cds.


1.7 


2564334 


(AB006631) The human
homolog of mouse Cux-2
[Homo sapiens] 


1.0 


168 


X72222 


M.musculus gene for 
serotonin 2 receptor 


1.7 


3875796 


^/j40) similarity to Yeast
hypothetical YIK9 protein 
(SW:YIK9_YEAST); cDNA 
EST EMBL:T01252 comes 
from this gene; cDNA EST 
EMBL:D33205 comes from this 
gene; cDNA EST 
EMBL:D33955 comes from this 
gene; cDNA EST 
EMBL:D35484 co...


1.0 


169 


U23186 


Crotalus scutulatus 

PLA2-like 

pseudogene 


1.7


853971 


(X83413) DR5 [Human
herpesvirus 6] >gi|853972
(X83413) DR5 [Human
herpesvirus 6]


0.99 


170 


M83118


Mus musculus factor 
VIII-associated
protein (f8a) mRNA, 
complete cds. 


1.7 


3201617 


(AC004669) hypothetical 
protein [Arabidopsis thaliana] 


0.80 


171 


M38347 


E.coli ATP-
dependent proteinase
(lon) gene, complete
cds. 


1.7 


4140322 


(AL031282) dimHi SI (Cell
Division Cycle 2-Like 2 
(PITSLRE, p58/GTA, 
Galactosyltransferase 
Associated Protein Kinase)) 
(isoform beta 2-2) [Homo 
sapiens] 


0.78 


172 


U28838


Human transcription 
factor TFIIIB 90 kDa 
subunit 


1.7 


2495730 


HYPOTHETICAL PROLINE-
RICH PROTEIN KIAA0269
>gi|1665805|gnl|PID|d1014089
(D87459) Similar to Volvox
carteri extensin (S22697)
[Homo sapiens]


0.62 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















173 


U72487 


Rattus norvegicus 
calcium-independent 
alpha-latrotoxin
receptor mRNA, 
complete cds 


1.7 


544411 


GLYCOPROTEIN GP100 
PRECURSOR (P29F8) 
discoideum] 


0.35 


174 


AE000718 


Aquifex aeolicus 
section 50 of 109 of 
the complete genome 


1.7 


2497569 


FIBROBLAST GROWTH 
FACTOR RECEPTOR 3 
PRECURSOR (FGFR-3) 
(HEPARIN-BINDING
GROWTH FACTOR 
RECEPTOR) 
>gi|2117851|pir||I55363 

3 - mouse >gi|199145 (M81342) 
fibroblast growth factor receptor 
3 [Mus musculus] 


0.34 


175 


AF016897 


Oryza sativa GDP 
dissociation inhibitor 
protein OsGDI2 
(OsGDI2) mRNA, 
complete cds 


1.7 


125362 


MACROPHAGE COLONY
STIMULATING FACTOR I
RECEPTOR PRECURSOR
(CSF-1-R) (FMS PROTO-
ONCOGENE) (C-FMS) factor 1
receptor - cat >gi|163855
(J03149) M-CSF receptor [Felis
domesticus] 


0.34 


176 


U95102 


mitotic 

phosphoprotein 90 
mRNA, complete cds 


1.7 


85058 


muscarinic acetylcholine
receptor - fruit fly acetylcholine
receptor [Drosophila
melanogaster]


0.20 


177 


AF077352 


Chlamydomonas 
reinhardtii myosin 
heavy chain 


1.7 


728901 


ACROSOMAL PROTEIN SP-
10 PRECURSOR SP-10 -
western baboon 
>gi|298488|bbs|127113 
(S56458) SP-10=intraacrosomal 
protein [Papio papio=baboons, 
Peptide. 285 aa] [Papio 
hamadryas] 


0.20 


178 


Z92788 


Caenorhabditis 
elegans cosmid
F53B8, complete
sequence
[Caenorhabditis
elegans]


1.7 


746516 


(U23517) D1022.7
[Caenorhabditis elegans]
>eil3258651 elegans]


0.068 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















179


AF002217


Ralstonia eutropha
megaplasmid pHG1
nitric oxide reductase
(norB) gene,
complete cds


1.7


1143538


(X87883) mitochondrial capsule
selenoprotein [Rattus
norvegicus] >gi|1354135
(U48702) mitochondria
associated cysteine-rich protein
SMCP


0.039 


180 


D30749 


Rat mRNA for 
protein tyrosine 
phosphatase 


1.7


1228035 


(D83776) The KIAA0191 gene
is expressed ubiquitously.; The
KIAA0191 protein retains the
C2H2 zinc-finger at its N-
terminal region. [Homo sapiens]


0.008 


181 


M15202 


Rat fast skeletal TnT 
gene encoding 
troponin T isoforms, 
complete cds. 


1.7 


731172 


SKIN SECRETORY PROTEIN 
XP2 PRECURSOR 


4e-04 


182 


L07592 


Human peroxisome 
proliferator activated 
receptor mRNA, 
complete cds. 


1.7 


4033414 


PUTATIVE IMPORTIN BETA-
4 SUBUNIT


2e-06 


183 


U64031 


Dendrobium 
crumenatum ACC 
synthase gene, 
complete cds 


1.7 


3122885 


ASPARTYL-TRNA 
SYNTHETASE synthetase 
[Bacillus subtilis] 


2e-11


184 


AF034970 


Homo sapiens 
docking protein 
(DOK-2) mRNA, 
complete cds 


1.7 


2289097 


(U78737) 

alpha(1,3)fucosyltransferase
[Cricetulus griseus]


8e-12


185 


Z12839 


L.longiflorum mRNA
encoding calmodulin.

> ::

gb|L18912|LILCALM
ODU Lilium
longiflorum
calmodulin mRNA,
complete cds.


1.7 


2511747


(AF023270) probable
transcriptional regulator drc4


4e-12



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















186 


X53459 


Equine arteritis virus 
(EAV) RNA genome 

> :: 

emb|A45589|A45589 
Sequence 1 from 
Patent WO9519438 >

emb|A58849|A58849 
Sequence 1 from 
Patent WO9700963 > 

gb|AR013959|AR013 
959 Sequence 1 from 
patent US 5773235 


1.7 


3979817 


(Z70633) Weak similarity to
Human tyrosine-protein kinase
CSK (SW:CSK_HUMAN);
cDNA EST EMBL:C10908
comes from this gene; cDNA
EST EMBL:C12822 comes
from this gene; cDNA EST
yk408c2.3 comes from this
gene; cDNA EST yk408c2.5 ...
Human tyrosine-protein kinase
CSK (SW:CSK_HUMAN);
cDNA EST EMBL:C10908
comes from this gene; cDNA
EST EMBL:C12822 comes
from this gene; cDNA EST
yk408c2.3 comes from this
gene; cDNA EST yk408c2.5 ...


le-14 


187 


K02668 


E. coli ddl gene 
encoding D-alanine:D-
alanine ligase and
ftsQ and ftsA genes, 
complete cds, and 
ftsZ gene, 5' end. 


1.7 


3879121 


(Z70310) predicted using
Genefinder; Similarity to Mouse
ankyrin (PIR Acc. No. S37771);
cDNA EST EMBL:T01923
comes from this gene; cDNA
EST EMBL:D32335 comes
from this gene; cDNA EST
EMBL:D32723 comes from this
gene; cDNA ES... Genefinder;
Similarity to Mouse ankyrin
(PIR Acc. No. S37771); cDNA
EST EMBL:T01923 comes
from this gene; cDNA EST
EMBL:D32335 comes from this
gene; cDNA EST
EMBL:D32723 comes from this
gene; cDNA ES...


2e-19


188 


AB008375


Homo sapiens mRNA
for osteoblast specific
cysteine-rich protein,
complete cds


1.7 


2496945


HYPOTHETICAL 55.9 KD
PROTEIN EEED8.6 IN
CHROMOSOME II >gi|733603
(U23484) No definition line
found [Caenorhabditis elegans]


1e-19


189 


L36603


Pseudomonas cepacia
(clone Psudom70-1)
heat shock protein 70
(hsp70) gene,
complete cds


1.7 


2661842


(Y15732) DNA polymerase beta
[Xenopus laevis]


6e-20 



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Nearest Neiehbor (BlastN vs. Gcnbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












HYPOlHbULAL O.0 itL) 




190 


Z49760


P.blakesleeanus 
mRNA GTP 
cyclohydrolase I 


1.7 


1731181 


PROTEIN C14A4.3 IN
CHROMOSOME II
>gi|3874230|gnl|PID|e1351618
protein (Swiss Prot accession 
number P38376); cDNA EST 
yk220el0.5 comes from this 
gene [Caenorhabditis elegans] 


3e-21 


191 


U52428 


Human fatty acid
synthase gene, partial 
cds 


1.7 


4226073


(AF125443) contains similarity
to S. pombe phosphatidyl 
synthase (GB:Z28295) 
[Caenorhabditis elegans] 


6e-25 


192 


U12767 


Human mitogen 
induced nuclear 
orphan receptor 


1.6 


<NONE> 


<NONE> 


<NONE> 


193 


Z63478 


H.sapiens CpG DNA,
clone 85a12, forward
read cpg85a12.ft1a


1.6 


<NONE> 


<NONE> 


<NONE> 


194 


AF084375 


Homo sapiens 
inversin protein, 
exons 8 and 9 


1.6 


<NONE> 


<NONE> 


<NONE> 


195 


AE001U4 


Archaeoglobus 
fulgidus section 165 
of 172 of the 
complete genome 


1.6 


<NONE> 


<NONE> 


<NONE> 


196 


AF084375 


Homo sapiens 
inversin protein, 
exons 8 and 9


1.6 


<NONE> 


<NONE> 


<NONE> 


197 


U24217 


Kluyveromyces lactis
RNA polymerase II
largest subunit gene,
partial cds 


1.6 


<NONE>


<NONE> 


<NONE> 


198 


AE000580


Helicobacter pylori
26695 section 58 of
134 of the complete
genome


1.6 


<NONE> 


<NONE> 


<NONE> 






SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)


ACCESSION


DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ACCESSION


DESCRIPTION


P VALUE


199 | X62083
200 | M28064


H.sapiens mRNA for
Drosophila female
sterile homeotic
(FSH) homologue >
gb|M80613|HUMFSHG
Human homolog
of Drosophila female 
sterile homeotic 

mRNA, complete cds 
Plasmodium 

brasilianum DNA 

homologous to the 

histidine-rich knob 

protein region of 

Plasmodium 

falciparum. 


1.6 | <NONE>

1.6 | 457495


<NONE> <NONE> 

VI26647) ORF X 

SaccharomYces cere vis irwO 1 0 a 


201 | UQ3U4

202 | U88422


Streptomyces albus
lipase precursor (lip)
gene, complete cds,
and unidentified 5'
ORF and 3' ORF,
partial cds.

Strix varia oocyte
maturation factor

Mos (c-mos) proto-

oncogene, partial cds


(AC004877) sco-spondin-mucin-
like; similar to P98167 uncertain
1.6 | 3638957 | [Homo sapiens] | 7 ,g

VITAMIN D3 RECEPTOR
(VDR) receptor [Rattus
1.6 | 137618 | norvegicus] | 5.4


203 | M68519

204 | AF044575


Human pulmonary
surfactant-associated
protein SP-A
(SFTP1) gene,

complete cds.

Homo sapiens
transcription factor
POU4F3


(Z38112) E03A3.6
1.6 | 3875423 | [Caenorhabditis elegans] | 4.9

GABA transport protein -
1.6 | 2133625 | tobacco hornworm | 4.7


205 | L48476


Homo sapiens
(subclone 3_e10 from
P1 H21) DNA
sequence.


(AJ005588) 5-epi-aristolochene
1.6 | 3687297 | synthase | 4.6


206 | M18630


Rat CNS 2',3'-cyclic
nucleotide 3'-
phosphodiesterase


(Z81133) Similarity to Human

mRNA product KIAA0077

(TR:Q14997) [Caenorhabditis
1.6 | 3880315 | elegans] | 3.7








Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















207 


AF027174 


Arabidopsis thaliana 
cellulose synthase 

B) mRNA, complete 
cds 


1.6


267068 


TUMOR-ASSOCIATED 
ANTIGEN L6 


3.6 


208 


U53448 


OaUwIa lillUlUll IlCut 

shock protein 70 
(hsp70) gene, 
complete cds 


1.6 


1255429 


(U53155) strong similarity to 
the carboxyl two-thirds of valyl- 
tRNA synthetases 
[Caenorhabditis elegans] 


2.2 


209 


AF084367 


Homo sapiens 
inversin protein 
mRNA, complete cds 


1.6 


1730076 


PROBABLE 

SERINE/THREONINE-

PROTEIN KINASE CY49.28 
>gi| I j /U2!)5|gnl|rUJ|e24 /U94 
(Z73966) pknJ 


1.2 


210 




Yeast disl+ gene for 
p93disl, complete 
cds 


1.0 


3128353 


(AF010496) maltose transport
inner membrane protein 


1.2 


211 


AF035756 


Streptomyces sp. 2- 
dehydro-3- 
deoxyphosphohepton 
ate aldolase gene, 
partial cds 


1.6 


853971 


(X83413) DR5 [Human
herpesvirus 6] >gi|853972
(X83413) DRd [Human
herpesvirus 6] 


0.97 


212 


X73479 


O.cuniculus rPTPA 
mRNA 


1.6 


3413810 


(Y17034) Bassoon [Mus 
musculus] 


0.94 


213 


X98330 


H.sapiens mRNA for
ryanodine receptor 2 


1.6 


2072986 


(U95142) putative G-protein- 
coupled receptor G-protein-
coupled receptor [Arabidopsis
thaliana] 


0.73 


214 


X64194 


P.anserina FMR1 
gene exons 1 and 2 


1.6 


128014 


NECDIN >gi|91129|pir||JN0148
necdin, brain - mouse
>gi|200020 (M80840) necdin 
[Mus musculus] 


0.42 


215 


Z92788 


Caenorhabditis 
elegans cosmid 
F53B8, complete 
sequence 
[Caenorhabditis 
elegans]


1.6 


746516 


(U23517) D1022.7
[Caenorhabditis elegans]
>gi|3258651 elegans] 


0.19 


216 


AE000888


Methanobacterium
thermoautotrophicum
from bases 1098908
to 1112186 (section
94 of 148) of the
complete genome


1.6 


462415


INTERFERON-ALPHA/BETA
RECEPTOR ALPHA CHAIN
PRECURSOR (IFN-ALPHA-
REC) >gi|346520|pir||S27387
interferon alpha receptor type 1 -
bovine >gi|432


0.001 






Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


217


AB008375


Homo sapiens mRNA
for osteoblast specific
cysteine-rich protein,
complete cds


1.6


2496945


HYPOTHETICAL 55.9 KD
PROTEIN EEED8.6 IN
CHROMOSOME II >gi|733603
(U23484) No definition line
found [Caenorhabditis elegans]


1e-18


218


M25312


Orang-utan involucrin
gene, complete cds.


1.6


3875131


(Z70750) similar to vanadate
resistance protein
transmembranous domains
[Caenorhabditis elegans]


3e-26


219


AB012882


Cyprinus carpio
mRNA for MyoD,
complete cds


1.5


<NONE>


<NONE>


<NONE>



220 



U29487 



Caenorhabditis 
elegans cosmid 
C09C7 



1.5 



<NONE> 



<NONE> 



<NONE> 



221 



222 



223 



X74760 



M.musculus mRNA 
for Notch 3 



1.5 



1364094 



integral membrane protein - 
Streptomyces pristinaespiralis 
>gi|872306 (X84072) integral 
membrane protein 

[Streptomyces pristinaespiralis]
EXOGLUCANASE II



4.3 



U72396 



Lycopersicon 
esculentum class II
small heat shock
protein Le-HSP17.6
mRNA, complete cds 



1.5 



121855 



PRECURSOR cellulose 1,4-beta
cellobiosidase (EC 3.2.1.91) II
precursor - fungus (Trichoderma
reesei) 1,4-beta-cellobiosidase
(EC 3.2.1.91) II - fungus
cellobiohydrolase II
[Trichoderma reesei]



U42391 



Human myosin- IXb 
mRNA, complete cds 



1.5



3688428 



(AJ011534) sucrose synthase



4.3 



224 



225 



M92296 



Pongo pygmaeus
gamma-1 and gamma
globin genes,
complete cds.



1.5 



186413 



X94144 



C.japonica mRNA for
QNR-71 protein 



1.5 



2745737 



(M13144) inhibin A [Homo
sapiens] 



(AF029791) UDP-
Gal:betaGlcNAc beta 1,3-
galactosyltransferase-II [Mus
musculus] 



0.22 








Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















226 


AB014557


Homo sapiens mRNA 
for KIAA0657 
protein, partial cds 


1.5 


1212992 


(X90568) Protein sequence and 
annotation available soon via 
Swiss-Prot; available at present 
via e-mail from 
LABEIT@EMBL- 
Heidelberg.DE [Homo sapiens] 


4e-13 


227 


AF000948 


Borrelia burgdorferi

oligopeptide 
permease homolog 

V-/ppA.i V ^Opp/\l V J 

gene, complete cds 


1.3 


<NONE>


<NONE> 


<NONE> 


228 


AF057287 


Mus musculus 
RAB/Rip protein 
mRNA, partial cds. 


1.3 


2498005 


MYC PROTO-ONCOGENE 
PROTEIN (C-MYC) proto- 
oncogene [Sus scrofa] 


2.6 


229 


U38951 


Drosophila
melanogaster
vacuolar ATPase
subunit E 


1.1 


<NONE> 


<NONE> 


<NONE> 


230 


AF027148 


Homo sapiens 
myogenic 

determining factor 3 


1.1 


3172134 


(U90209) RNA polymerase II 
largest subunit [Bonnemaisonia 
hamifera] 


2.3 


231 


AF079310 


Mus musculus histone 
deacetylase 3 
(Hdac3) gene, exons 
4 through 15 and 
complete cds 


1.0 


1657601 


(U66220) unknown 
[Nannocystis exedens] 


0.25 


232 


X52134


P.radiata lac gene for 
laccase 


0.95 


996020 


gallus] 


0.31 


233 


D89016 


Human mRNA for 
Neuroblastoma, 
complete cds 


0.93 


<NONE> 


<NONE> 


<NONE> 


234 


X76392 


C.familiaris VIP36 
(vesicular integral- 
membrane protein of 
36 kDa) mRNA 


0.93 


4176446 


(AL022238) dJ1042K10.2.1
(novel protein with probable
rabGAP domains and Src 
homology domain 3) 


7e-81


235 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


0.90 


<NONE> 


<NONE> 


<NONE> 






Nearest Neighbor (BlastN vs. Genbank)



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



SEQ 
ID 



ACCESSION 



DESCRIPTION 



P VALUE | ACCESSION



DESCRIPTION 



EGTTFKOTETNTRECURSORT 



P VALUE 



236 



AE000991 



Archaeoglobus 
fulgidus section 116 
of 172 of the 
complete genome 



0.90 



237 



Z35922 



S.cerevisiae
chromosome II 
reading frame ORF 
YBR053c 



1176579 



0.86 



<NONE> 



(LAKLV (Jl TRANSCRUT 2)" 
>gi|1362345|pir||S55862 
probable membrane protein 
YNL327w - yeast 
(Saccharomyces cerevisiae) 
cerevisiae] 

>gi|1302445|gnl|PID|e239572 
(Z71603) ORF YNL327w
[Saccharomyces cerevisiae] 



<NONE> 



6.9 



<NONE> 



238 



U47331 



Rattus norvegicus 
metabotropic 
glutamate receptor 4b 
mRNA, complete cds 



0.82 



1550703 



(Z80225) hypothetical protein 
Rv2662 



4.1 



239 



X72810 



H.sapiens Ig germline 
kappa-chain gene 
variable region (L3) 



0.69 



3023063 



(AF052587) FI4 [Xylella 
fastidiosa] 



6.7 



240 



Z11700 



Escherichia coli 
genes faeG, faeH,
faeI, faeJ and IS629-
like insertion
sequence. > ::
emb|Z11710|ECFAE
HIJ E.coli faeH, faeI
and faeJ genes
encoding FaeH, FaeI
and FaeJ proteins 



0.69 



2347188 



(AC002338) laccase isolog 
[Arabidopsis thaliana] thaliana] 



3.9 



241 



U71597 



Phrynosoma 
douglassii NADH 
dehydrogenase 
subunit 4 (ND4)
gene, mitochondrial
gene encoding
mitochondrial
protein, partial cds



0.65 



<NONE> 



<NONE> 



<NONE> 








Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE


Ammonia species
LSU rRNA gene
(partial; isolate Tr S
242 | Z77798 | 5; clone 16)


0.64 | 1174506


GLUTAMYL-TRNA
SYNTHETASE glutamate--
tRNA ligase (EC 6.1.1.17) -
Haemophilus influenzae (strain
Rd KW20) >gi|1573240
(U32713) glutamyl-tRNA
synthetase (gltX) [Haemophilus
influenzae Rd]


1.2


Human mRNA for
golgi antigen gcp372,
243 | D25542 | complete cds


0.64 | IH230


ultra-high-sulfur keratin I -
mouse


1e-05


Cow dopamine
transporter mRNA,
244 | M80234 | putative cds.


0.64 | 3874972


(Z99709) similar to Elongation
factor Tu family (contains
ATP/GTP binding P-loop);
cDNA EST EMBL:D76223
comes from this gene; cDNA
EST yk478c5.5 comes from this
gene [Caenorhabditis elegans]




Homo sapiens mRNA
for KIAA0449
245 | AB007918 | protein, partial cds


0.64 | 2833239


EPIDERMAL GROWTH

FACTOR RECEPTOR
KINASE SUBSTRATE EPS8
>gi|530823 (U12535) epidermal
growth factor receptor kinase
substrate [Homo sapiens]


2e-14


Human U266
rearranged DNA for
lambda-
immunoglobulin light
246 | X51754 | chain


0.63 | 2072301


(U95102) mitotic
phosphoprotein 90 [Xenopus
laevis]


1.5


Helicobacter pylori,
strain J99 section 115
of 132 of the
247 | AE001554 | complete genome


0.62 | <NONE>


<NONE>


<NONE>


H.sapiens CpG DNA,
clone 96e7, reverse

248 | Z64067 | read cpg96e7.rt1a
Pinus sylvestris
microsatellite DNA,

249 | AJ223768 | clone SPAC 11.5


0.62 | <NONE>
0.62 | <NONE>


<NONE>

<NONE>


<NONE>
<NONE>



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















250 


AJ011592


Bacteriophage P1 ban
gene 


0.62 


2493689 


Puri'Tri'^'VQTfciu "it \Y\ VT\ 

i nU 1 Uj IjI CJVl Li LU ISJJ 

PHOSPHOPROTEIN deltoides] 
>gi|2 14332o|gnl|PID|e3 19090 
(Y13328) 10kDa
phosphoprotein [Populus 
deltoides] 


7.9 


251 


ArUz/ iJl 


Xenopus laevis
survival of motor
neuron protein
interacting protein 1
(SIP1) mRNA,
complete cds


0.62 


4007790 


(AL034463) putative single- 
strand polynucleotide binding 
protein [Schizosaccharomyces 
pombe] 


2.0 


252 


AJ000376


Helobdella triserialis
mRNA for actin 


0.62 


1117968 


(U40763) CARS-Cyp [Homo 
sapiens] sapiens] 


0.90 


253 


\ f ami i 


Rat thymosin beta 4 
gene (pTB4G).intron. 


0.62 


4176370 


(AC005058) similar to calcium- 
independent phospholipase A2; 
similar to AC004392 
(PID:g3367519) [Homo 
sapiens] 


6e-51 


254 




Homo sapiens X11L2
mRNA for X11-like
protein 2, complete 
cds 


0.61 


<NONE> 


<NONE> 


<NONE> 


255 


D26470 


Bacteroides 
gingivalis DNA for 
arginyl 

endo peptidase, 
complete cds 


0.61 


<NONE> 


<NONE> 


<NONE> 


256 


J04737 


A.thaliana ATPase
gene, complete cds. 


0.61 


<NONE> 


<NONE> 


<NONE> 


257 


U06756 


Bos taurus clone 
bml308 

microsatellite and are- 
Ip repeat region. 


0.61 


1922280 


(Y09905) snail like protein
[Gallus gallus]


0.51 


258 


S75756 


p15=cyclin D-
dependent kinases 4
and 6-binding
protein/p15 product
{exon/intron 1}
[human, brain tumors,
Genomic, 753 nt] 


0.61 


484938 


hypothetical protein 253 - 
Streptomyces griseus plasmid
pSG1 (fragment)


0.13 


259 


L39837


Drosophila
melanogaster tumor
supressor (warts)
mRNA exons 1-8,
complete cds.


0.61


3875131


(Z70750) similar to vanadate
resistance protein
transmembranous domains
[Caenorhabditis elegans]


1e-09






SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)


ACCESSION


DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ACCESSION


DESCRIPTION


P VALUE


260
261


U52428
X15292


Human fatty acid
synthase gene, partial

cds

Plasmodium
falciparum gene for
heat-shock protein
pPf203


0.61 | 4226073
0.60 | <NONE>


(AF125443) contains similarity
to S. pombe phosphatidyl
synthase (GB:Z28295)
[Caenorhabditis elegans] | 2e-26

<NONE> | <NONE>


262 


AB020663


Homo sapiens mRNA
for KIAA0856
protein, partial cds


0.60 | 470341


(U00043) No definition line
found [Caenorhabditis elegans] | 5.7


263 


U68723 


Human checkpoint 
suppressor 1 mRNA, 
complete cds 


0.60 | 544375


PROTEIN REGULATOR
glucose/galactose binding
protein regulator -
Agrobacterium tumefaciens
>gi|142228 (L10424)
glucose/galactose binding
protein regulator | 5.7


264 


M32687 


S.griseus sporulation
protein genes 1590
and 1422.


0.60 | 2582017


(AF012871) Merg1a [Mus
musculus] | 3.3


265 


AJ005331 


Homo sapiens 
NKCC2 gene, exon 4, 
isoform B 


0.60 | 3128353


(AF010496) maltose transport
inner membrane protein | 1.5


266 


U14103


Mus musculus RGL
protein mRNA,
complete cds.


0.60 | 4099845


(U90533) serine protease
inhibitor [Streptomyces fradiae] | 0.098


267 


U95094


Xenopus laevis XL-
INCENP (XL-
INCENP) mRNA,
complete cds


0.59 | 3282851


(AF047897) ankyrin-like protein
IGE-ANK [Ehrlichia sp. BDS] | 5.5


268 


AE000872


Methanobacterium
thermoautotrophicum
from bases 896604 to
912784 (section 78 of
148) of the complete
genome


0.59 | 401553


HYPOTHETICAL 24.5 KD
PROTEIN IN NADB-SRMB
INTERGENIC REGION | 4.3






Nearest Neighbor (BlastN vs. Genbank)



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



SEQ

ID | ACCESSION



DESCRIPTION | P VALUE | ACCESSION



269 | L11871



270 [ AF017114 



271 I AF027807 



272 | U81787 



Gallus gallus achaete 
scute homologue 
(ASH) mRNA, 
complete cds. 



Oryctolagus 
cuniculus glycogen 
synthase mRNA, 
complete cds 



0.59 



628110 



273 | U76036 



274 



AB014564 



AF044171 



Homo sapiens beta- 
casein (CSN2) gene 
complete cds 

Human Wnt10B
mRNA, complete cds
Apteryx australis 1
ribosomal RNA gene,
mitochondrial gene 
for mitochondrial 
RNA, partial 
sequence 



Homo sapiens mRNA 
for KIAA0664 
protein, partial cds 



Homo sapiens cyclin-
dependent kinase
inhibitor 2D
(CDKN2D) gene,
partial cds



0.59 



728856 



0.59 



0.59 



3252932 



3875538 



0.59 



4193356 



0.59 



1709851 



DESCRIPTION 



iiypuuieutiii pi uiein - iiiiukui ■ 



TT e ip^vlnis 4 leading I r affle I " 
[Human herpesvirus 4] 2 
[Human herpesvirus 4] 
>gi|1334838|gnl|PID|e25079 4
[Human herpesvirus 4]
>gi|1334840|gnl|PID|e25081 6
[Human herpesvirus 4]
>gi|1334842|gnl|PID|e25067 8
[Human herpesvirus 4]
>gi|1334844|gnl|PID|e25069 10
[Human herpesvirus 4]
>gi|1334846|gnl|PID|e25071 12
[Human herpesvirus 4]

NITROGENASE IRON-IRON
PROTEIN ALPHA CHAIN
(NITROGENASE
COMPONENT I)
(DINITROGENASE) capsulatus]
>gi|312238 (X70033)
alternative nitrogenase



(AF067155) truncated rev 
protein [Human 
immunodeficiency virus t 

(Z67990) similar to cuticle
collagen



1] 



0.59 



3925213 



(AF055088) ATP-binding
cassette; PsaB [Streptococcus 

pneumoniae] 

PTB-ASSOCIATED SPLICING
FACTOR (PSF) long form -
human >gi|38458 (X70944)
PTB-associated splicing factor
[Homo sapiens]



P VALUE 



4.2 



2.4 



1.5 



1.4 



0.83 



0.17 



(AL032626) Y37D8A.17 
[Caenorhabditis elegans]



3e-10 






Nearest Neighbor (BlastN vs. Genbank)



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



SEQ 
ID 



ACCESSION 



DESCRIPTION 



P VALUE 



ACCESSION 



DESCRIPTION 



P VALUE



276 



L19640



Saccharomyces 
cerevisiae cdc2/cdc28 
related protein kinase 
gene, complete cds.



277 



Z80999



278 



YII108 



279 



U80001 



Human DNA
sequence from
cosmid E140G5 on
chromosome 22,
complete sequence
[Homo sapiens]



0.59 



(Z81130) T23G11.9
3880115 [Caenorhabditis elegans] 



H.sapiens WNT8B 
gene 



Sphyraena idiastes 
lactate dehydrogenase 
A 



0.58 



<NONE> 



<NONE> 



0.58 



<NONE> 



<NONE> 



0.58 



<NONE> 



<NONE> 



1e-21



<NONE>



<NONE> 



280 



Z49637



S.cerevisiae 
chromosome X 
reading frame ORF 
YJR137c 



0.58 



<NONE> 



<NONE> 



281 



X64467 



H.sapiens ALAD 
gene for 

porphobilinogen 
synthase 



0.58 



<NONE> 



<NONE> 



282 



X74506 



G.gallus hox B3 
mRNA 



0.58 



283 



U68040 



Cochliobolus 
heterostrophus 
polyketide synthase



<NONE> 



<NONE> 



0.58 



<NONE> 



<NONE> 



<NONE> 



284 AF089084 



Arabidopsis thaliana 
putative auxin efflux 
carrier protein (PLN1) 
mRNA, complete cds



0.58 



<NONE>



<NONE> 



<NONE> 



285 



U38481 



Rattus norvegicus 
ROK -alpha mRNA, 
complete cds 



0.58 



<NONE> 



<NONE> 



286 



AF017656 



Homo sapiens G 
protein beta 5 subunit 
mRNA, complete cds



0.58 



3236249 



(AC004684) hypothetical 
protein [Arabidopsis thaliana] 



287 



M96234 



Human glutathione 
transferase class mu 
number 4 . 



0.58 



1280073 



(U55366) Similar to cuticle 
collagen [Caenorhabditis 
elegans] 



288 



AB002339 



Human mRNA for 
KIAA0341 gene, 
partial cds 



0.58 



861293 



(U28741) F35D2.1 gene
product [Caenorhabditis 
elegans] 






SEQ 
ID 



Nearest Neighbor ( BlastN vs. Genbank) 



ACCESSION 



DESCRIPTION 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



DESCRIPTION 



P VALUE 



289 



290 



291 



292 



U11295 



Neisseria 
gonorrhoeae 
carbamoyl phosphate 
synthetase 
(glutamine) small
subunit (carA) and 
large subunit (carB) 
genes, complete cds. 



0.58 



D80001 



Human mRNA for 
KIAA0179 gene, 
partial cds 



2425135 



0.58 



4097223 



Z11700 



Escherichia coli 
genes faeG, faeH,
faeI, faeJ and IS629-
like insertion
sequence. > ::
emb|Z11710|ECFAE
HIJ E.coli faeH, faeI
and faeJ genes
encoding FaeH, FaeI
and FaeJ proteins 



0.58 



M77350 



Mouse hair keratin 
A1 (MHKA1) gene,
complete cds. 



0.58 



2347188 



141165 



(AF020283) DG2044 gene 
product [Dictyostelium
discoideum] 



(U49836) gamma-glutamyl 
transpeptidase precursor [Brugia 
malayi] 



5.3 



(AC002338) laccase isolog 
[Arabidopsis thaliana] thaliana]



HYPOTHETICAL 8.3 KD 
PROTEIN >gi|62179



3.2 



293 



X63787 



thermophila gene 
for snRNA U3-2 



0.58 



2826900 



(AB004461) DNA polymerase 
alpha catalytic subunit [Oryza 
sativa] 



294 



295 



D63881 



U39378 



Human mRNA for
KIAA0160 gene,

partial cds

Gymnocarena



0.58 



1934730 



(U95036) germin-like protein
[Arabidopsis thaliana]



mexicana 16S 
ribosomal RNA gene, 
mitochondrial gene 
encoding 

mitochondrial RNA, 
partial sequence 



0.58 



2194131 



(AC002062) Similar to 
Synechocystis antiviral protein 



3.1 



296 



X87987 



P.pastoris PRC1 gene 

> :: 

dbj|E12103|E12103
DNA encoding 
precursor of protease 
from Pichia pastoris 



0.58 



3914197 



OCCLUDIN >gi|1276983 
(U49221) occludin [Canis
familiaris]

>gi|1589181|prf||2210347D
occludin [Canis familiaris]






Nearest Neighbor (BlastN vs. Genbank)



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



SEQ 
ID



ACCESSION 



DESCRIPTION 



P VALUE 



ACCESSION 



DESCRIPTION 



P VALUE



297 



X' ? 5782 



A.thaliana (L.Heynh.) 
chloroplast mRNA
for recombinant APS 
kinase 



298 



M64848 



Mouse platelet- 
derived growth factor 
B chain musculus 
platelet-derived 
growth factor beta- 
chain (sis) gene, exon 



0.58 



1732444 



(D38529) DRPLA protein 
[Homo sapiens] 



2.4 



299 



AE001460 



Helicobacter pylori, 
strain J99 section 21 
of 132 of the 
complete genome 



0.58 



3025832 



(AF055985) pyrrolidone-rich 
antigen [Onchocerca volvulus 1 



1.4 



0.58 



2827198 



300 



X65720 



M.musculus gene for 
protein kinase C- 
gamma (exonl and 
exon 2) 



0.58 



418395 



(AF037454) ubiquitin protein 
ligase [Mus musculus]



CHD1 PROTEIN
>gi|320737|pir||S30818
hypothetical protein YER164w -
yeast (Saccharomyces
cerevisiae) >gi|603404
(U18917) Chd1p: transcriptional
regulator [Saccharomyces 
cerevisiae] 



1.1



301 



AF043130 



Arabidopsis thaliana 
lactate dehydrogenase 



Human genes for 
collagen type IV 
alpha 5 and 6, exon 
and exon 1' 



0.58 



3024637 



SEX- DETERMINING 
REGION Y PROTEIN 
determining protein [Mus 



0.62 



302 



D28116 



0.58 



1458250 



(U64835) T09D3.3 
[Caenorhabditis elegans]



0.36 



303 



AE001075



Archaeoglobus 
fulgidus section 32 of 
172 of the complete 
genome 



0.58 



2276333 



(Z97991) hypothetical protein 
Rv0336 



0.36 



304 



AF003948 



305



U10692



Rhodococcus opacus 
chloromuconate 
cycloisomerase 
transposase homolog
genes, complete cds 



Human MAGE-7 
antigen (MAGE7) 
pseudogene, complete 
cds. 



0.58 



477072 



mucin 7 precursor, salivary -
human 



0.28 



0.58 



3287858 



HOMEOBOX PROTEIN HOX- 
11 



0.054 



Nearest Neighbor (BlastN vs. Genbank)



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



SEQ 
ID 



ACCESSION 



DESCRIPTION 



P VALUE 



ACCESSION 



DESCRIPTION 



P VALUE



Rhodococcus opacus 
chloromuconate 
cycloisomerase 
transposase homolog 
306 | AF003948 | genes, complete cds



307 | X99350



H.sapiens HFH4 
gene, exon I and 
joined CDS 



0.58 



3551821 



(AF058803) mucin 4 [Homo 
sapiens] 



0.58 



137483 



VAV PROTO-ONCOGENE 
>gi|55221 (X64361) proto- 
oncogene [Mus musculus]



0.041 



0.024 



308 | AJ234282 



Homo sapiens mRNA 
for Ig heavy chain 
variable region, clone 



309 | AF079310 



Mus musculus histone 
deacetylase 3 
(Hdac3) gene, exons 
4 through 15 and 
complete cds 



0.58 



(AC003682) R27945.2 [Homo 
3264846 | sapiens] | 0.018



(U66220) unknown 
0.58 | 1657601 [Nannocystis exedens] 



0.014 



310 | AF019367



Human thiopurine 
methyl transferase 
(TPMT) gene, exons 
6 and 7 



0.58 



3283352 



(AF063020) lens epithelium-
derived growth factor [Homo 
sapiens] 



0.011 



311 | X65720



M.musculus gene for 
protein kinase C- 
gamma (exonl and 
exon 2) 



0.58 



1790878 



(U38291) microtubule- 
associated protein la [Homo 
sapiens]



0.008 



312 | AB0U155 



Homo sapiens mRNA 
for KIAA0583 
protein, partial cds



SYNAPSINS IA AND IB 
0.58 | 1351166 >gi|163713 



0.006 



313 | X63692



H.sapiens mRNA for 
DNA 



0.58 



1817548 



(D84307) phosphoethanolamine
cytidylyltransferase [Homo
sapiens] | 0.001



314 | U53746 



Feline

immunodeficiency 
virus isolate FIV- 
Pco336-8 pol 
polyprotein (pol) 
gene, partial cds 



0.58 



315 | K00436



Rattus norvegicus
(clone rtl-l) pseudo-
Gly-tRNA gene.



2246532 



(U93872) ORF 73, contains 
large complex repeat CR 73



2e-05 



0.58 



206712 



(M64793) salivary proline-rich 
protein [Rattus norvegicus]



1e-05








Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


316 

317 
318 


S79632 
AB007918 


HSF2=heat shock
factor 2 {alternatively
spliced, splice
junction region}
[mice, CBA/J, testis,
Genomic, 120 nt,
segment 2 of 3]

Rat liver mRNA for 
Kan-1, complete cds 

Homo sapiens mRNA 
for KIAA0449 
protein, partial cds 


y 

0.58 

0.58 
0.58 


lynJtizfyQ) tDtTl protein
4038594 | [Lycopersicon esculentum]
(U3i376j coded for by C.
elegans cDNA rnOi^fi- r><-u***A
for by C. elegans cDNA
cm01c2; similar to melibiose
carrier protein

(thiomethylgalactoside permease
1280135 | II)

FACTOR RECEPTOR
KINASE SUBSTRATE EPS8
>gi|530823 (U12535) epidermal
growth factor receptor kinase
2833239 | substrate [Homo sapiens]


3e-06


1e-08
3e-13


319 


AB001466 


Homo sapiens mRNA 
for Efsl, complete 
cds 


0.58 


\\u**j\j£i) sj KUa trypsin 
2943716 | inhibitor [Homo sapiens]


2e-14


320 




Saccharomyces 
cerevisiae CRE1 gene
for putative protein 
kinase. 


0.58 


(Z81130) T23G11.9 
3880115 | [Caenorhabditis elegans]


9e-21


321 
322 

323 


Z49535
M62506

X05944


S.cerevisiae
chromosome X
reading frame ORF

YJR035w

S.cerevisiae DBF20
gene, complete cds.
Yeast PSS gene for
phosphatidylserine
synthetase


0.58 
0.57 

0.57 


(Z83819) dJ146H21.2 (similar
to CYTOCHROME B-245
HEAVY CHAIN) [Homo
4106562 | sapiens]

<NONE> | <NONE>

<NONE> | <NONE>


3e-33
<NONE>


324 


D38536


mail gene for ADP-
ribosyl cyclase,
complete cds


0.57


<NONE> | <NONE>


<NONE>


325
326


Z75004

L77034


S.cerevisiae
chromosome XV
reading frame ORF

YOR096w

Homo sapiens
(subclone 10_e10
from P1 H16) DNA
sequence.


0.57
0.57


<NONE> | <NONE>

<NONE> | <NONE>


<NONE>
<NONE>






Nearest Neighbor (BlastN vs. Genbank) 



SEQ 

ID | ACCESSION



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



327 | D37887



328 | AB014562



DESCRIPTION | P VALUE | ACCESSION



Cyprinus carpio c- 



DESCRIPTION 



myc gene for c-Myc, 
complete cds 



0.57 



<NONE> 



<NONE> 



Homo sapiens mRNA
for KIAA0662

protein, partial cds | 0.57



(M57576) Ig kappa chain [Mus 
197406 | musculus]



P VALUE 



<NONE> 



329 | Z69651



Human DNA
sequence from
cosmid L75B9,
Huntington's Disease
Region, chromosome
4p16.3



0.57 



1079280 



chaperonin containing TCP-1
complex gamma chain - African
clawed frog >gi|793886

(X84990) Cctg



330 | D89285 



Mesocricetus auratus 
mRNA for inter-alpha-
trypsin inhibitor
heavy chain 1, 

complete cds | 0.57 



331 | Z48951



S.cerevisiae
chromosome XVI
cosmid 9723 | 0.57



RYANODINE RECEPTOR, 
134132 (SKELETAL MUSCLE 



(AJ130783) APC2 protein [Mus
4210432 | musculus]



8.9 



6.9 



332 | X95573



A.thaliana mRNA for
salt-tolerance zinc 
finger protein | 0.57 



1174828 



TYROSINE 
DECARBOXYLASE 2 
(EC 4.1.1.25) - parsley >gi|169671
(M96070) tyrosine
decarboxylase [Petroselinum



333 | U95094 



Xenopus laevis XL- 
INCENP (XL- 
INCENP) mRNA, 
complete cds



0.57 



465646 



PROBABLE ABC
TRANSPORTER ATP-
BINDING PROTEIN IN
NTRA/RPON 5'REGION

(ORF1) Azorhizobium
caulinodans >gi|311388

(X69959) ORF1



334 | AE001116



335 | Z34291



Borrelia burgdorferi 
(section 2 of 70) of 
the complete genome 



0.57 



2314735 



(AE000653) Na+/H+ antiporter
(nhaA) [Helicobacter pylori
26695]



R.norvegicus mRNA 
for putative chloride 
channel. 



0.57 



1350832 



DNA-DIRECTED RNA

POLYMERASE I SECOND
LARGEST SUBUNIT (RNA
POLYMERASE I SUBUNIT 2)
chain RPA2 - Euplotes
octocarinatus (SGC9)
>gi|578407 octocarinatus]



4.0 



4.0 






Nearest Neighbor (BlastN vs. Genbank) 



SEQ 
ID 



ACCESSION 



DESCRIPTION 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



DESCRIPTION 



P VALUE 



336 



D88255 



Homo sapiens A30 
Vk germline gene, 
partial cds 



(Z81063) similar to Actinin-type 



0.57 



3875983 



actin-binding domain containing 
proteins [Caenorhabditis 
elegans] 



337 



AF037261 



Homo sapiens SH3- 
containing adaptor 
molecule- 1 mRNA, 
complete cds 



0.57 



1397341 



338 



U26595 



Rattus norvegicus 
prostaglandin F2a 
receptor regulatory 
protein precursor, 
mRNA. complete cds 



0.57 



2773160 



(ufiiyjjj auuuai to wmam-iiKe 

protein; coded for by C elegans 
cDNA ykl84h5.3; coded for by 
C. elegans cDNA ykl84h5.5 
coded for by C. elegans cDNA 
ykl3d7.3; coded for by C. 
elegans cDNA ykl3d7.5; coded 
for by C. elegans cDNA 
yk3lel.5;co... >gi|3493541 
(AF057567) kinesin-like protein 
ZEN-4a [Caenorhabditis 
elegans] 



(AF039656) neuronal tissue- 
enriched acidic protein [Homo 
sapiens]



339 



X69903 



R.norvegicus mRNA 
for interleukin 4 
receptor 



0.57 



2649193 



(AE001009) quinone-reactive 
Ni/Fe-hydrogenase B-type 
cytochrome subunit (hydC) 
[Archaeoglobus fulgidus]



340 



Z74825 



S.cerevisiae 
chromosome XV 
reading frame ORF
YOL083w 



0.57 



1458319 



(U64S46) F47D2.5 gene 
product [Caenorhabditis 
elegans] 



341 



AJ131469



Foot-and-mouth 
disease virus O vpl 
gene, strain O/A/58 



342 



AF011360 



Mus musculus 
regulator of G-protein 
signaling 7 (RGS7) 
mRNA. complete cds 



343 



AF011360 



Mus musculus 
regulator of G-protein 
signaling 7 (RGS7) 
mRNA. complete cds 



0.57 



91206 



proline-rich protein - mouse 
(fragment) musculus] 



0.57 



542514 



gelsolin - American lobster 



0.57 



1078946 



gelsolin - American lobster 

gi|452313 gelsolin [Homarus
americanus] 



0.80 




344 


L39210 


Homo sapiens inosine 
monophosphate 
dehydrogenase type II
gene, complete cds 



0.57 


559526 


(X77466) 98.8kD polyprotein 
[Strawberry latent ringspot 
virus] 


0.79 


345 


U81523 


Human endometrial
bleeding associated 
factor mRNA, 
complete cds 


0.57 


211499 


(K01702) HMW/LMW collagen
subunit precursor [Gallus gallus] 


0.79


346 


U46561


Tetrahymena 
thermophila 
poiyuDiqmtin (i i uj; 
gene, complete cds, 
and RNA polymerase 
II subunit 2 (RPB2)
gene, partial cds 


0.57 


2506493 


HYPOTHETICAL 100.5 KD 
PROTEIN IN IAP-CYSH 
INTERGENIC REGION 
>gi|882654 (U29579) alternate 
gene name ygcB; ORF_f888 
[Escherichia coli] >gi|1789119


0.60 


347 


X95543 


C.japonica mRNA for 
legumin (clone 
CjLeg31) 


0.57 


1709261 


NEUROFILAMENT TRIPLET
M PROTEIN (160 KD
NEUROFILAMENT
PROTEIN) (NF-M)
>gi|1083164|pir||S55395
neurofilament protein M - rabbit
(fragment) >gi|854353


0.46 


348 


Y17282 


Homo sapiens mRNA 
for cytokeratin type II 


0.57 


3044086 


(AF055904) unknown
[Myxococcus xanthus]


0.45 


349 


XQG716 


Frog mRNA fragment 
for alpha- A2- 
crystallin 


0.57 


3406654 


(AF079369) transcriptional 
repressor TUP1 [Dictyostelium 
discoideum] 


0.20 


350 


X53238 


Klebsiella sp. 
bacteriophage K11
gene 1 for RNA

polymerase
\J*Jl YUiCI iuC 


U.J i 




[Z4691 j) polyketide synthase 


0.16 


351 


X99012 


H.sapiens FUS gene, 
exon 12 


0.57 


fi 243898 


(S78897) GOR=antigenic 
epitope [chimpanzees, Peptide, 
427 aa] [Pan]


0.090 


352 


AL008711 


Human DNA 
sequence from PAC 
390N22 on 
chromosome Xp22.2 


0.57 


1469545


(U53585) fibronectin attachment
protein [Mycobacterium avium]


0.053 


353 


S74506


SOX9 [human, fetal
brain, Genomic, 1494
nt, segment 3 of 3]


0.57


1326350


(U5S748) similar to potential
transmembrane domains in S.
cerevisiae nuclear division
RFT1 protein (SP:P38206)


0.017

354 


D25542 


Human mRNA for 
golgi antigen gcp372, 
complete cds 


0.57 


4063399 


(AF102575) cell surface protein
DTFA [Dictyostelium
discoideum]


0.005 


355 


AB015426


Mus musculus mRNA
for alpha 1,3- 
fucosyltransferase IX 
complete cds 


0.57 


2661842 


(Y15732) DNA polymerase beta
[Xenopus laevis]


7e-11


356 


X51394 


Xenopus mRNA for 
APEG protein, 
containing a highly 
repetitive amino acid 
sequence 


0.57 


1929056 


(Y12090) putative 3,4- 
dihydroxy-2-butanone kinase 
[Lycopersicon esculentum]


9e-12


357 


AB007918


Homo sapiens mRNA 
for KIAA0449 
protein, partial cds 


0.57 


2833239 


EPIDERMAL GROWTH 
FACTOR RECEPTOR 
KINASE SUBSTRATE EPS 8 
>gi|530823 (U12535) epidermal 
growth factor receptor kinase 
substrate [Homo sapiens] 


3e-13 


358 


AB001466 


Homo sapiens mRNA 
for Efs1, complete
cds 


0.57 


2943716 


(D45027) 25 kDa trypsin 
inhibitor [Homo sapiens] 


2e-14 


359 


Y00760 


Rabbit mRNA for 
adult fast skeletal 
troponin-C 


0.57 


2576348 


(AC002400) Glutamyl tRNA 
synthetase [Homo sapiens]


2e-28 


360 


X95153 


H.sapiens brca2 gene
exon 3 >

emb|A62778|A62778
Sequence 19 from
Patent WO9719110


0.57 


3419847 


(AC004982) similar to yeast
hypothetical protein ybk4;
Similar to P38164
(PID:g586461) [Homo sapiens]


2e-55 


361 


X85967 


B. vulgaris mRNA for 
betavulgin 


0.56 


<NONE> 


<NONE> 


<NONE> 


362 


U09251 


Mycoplasma
genitalium DNA
gyrase subunit B
complete cds, DNA
polymerase III beta
subunit (dnaN) and
seryl-tRNA
synthetase (serS)
genes, partial cds.


0.56 


<NONE> 


<NONE> 


<NONE> 


363


V00158


Chloroplast Euglena
gracilis genes coding
for transfer RNAs
specific for threonine,
glycine, methionine,
serine and glutamine.


0.56 


<NONE> 


<NONE> 


<NONE> 




364


D88151


Clostridium
perfringens DNA for
D-alanine:D-alanine
ligase, cortical
fragment-lytic
enzyme


0.56 


<NONE> 


<NONE> 


<NONE> 


365 


U67478 


Methanococcus 
jannaschii section 20 
of 150 of the 
complete genome 


0.56 


<NONE> 


<NONE> 


<NONE> 


366 


L23800 


Tachyglossus 
aculeatus beta-globin
homolog (HBB) 
gene, complete cds 


0.56 


<NONE> 


<NONE> 


<NONE> 


367 


AB011129


Homo sapiens mRNA 
for KIAA0557 
protein, partial cds


0.56 


<NONE> 


<NONE> 


<NONE> 


368 


L77034 


Homo sapiens 
(subclone 10_eI0 
from PI H16) DNA 
sequence. 


0.56 


<NONE> 


<NONE> 


<NONE> 


369 


Z47202 


C.albicans gene for 
TFIIIB (BRF1) 
subunit. 


0.56 


<NONE> 


<NONE> 


<NONE> 


370 


U53868 


Clostridium 
acetobutylicum 
mannitol-specific 
phosphotransferase 
system (PTS) system, 
mtlA, mtlR, mtlF, and
mtlD genes, complete
cds 


0.56 


<NONE>


<NONE> 


<NONE> 


371 


AF041259 


Homo sapiens breast 
cancer putative 
transcription factor 
(ZABC1) mRNA,
complete cds 


0.56 


<NONE> 


<NONE> 


<NONE> 


372 


L42636 


Plasmodium 
falciparum variant- 
specific surface 
protein (var-7) 
mRNA, complete cds.


0.56 


2213557 


(Z97052) hypothetical protein 


8.8 





373 


U96180 


Human protein 
tyrosine phosphatase 
(TEP1) mRNA,

complete cds


0.56


731016


THIOREDOXIN REDUCTASE
thioredoxin reductase (NADPH)
[Coxiella burnetii]


8.7


374


L76259


Homo sapiens PTS
gene, complete cds


0.56


2369863


(Y12225) Spi-1/PU.1
transcription factor


6.7


375


AF045946


Mus musculus
D16Jhu17 YAC
98B3 acentric end,
partial sequence


0.56


2130017


hypothetical protein - common
sunflower protein [Helianthus
annuus]


5.1


1 376 I X97986 


M.musculus mRNA 
for desmocollin type 
1 


I (AC005936) hypothetical 
056_ 1 4038031 protein UrahirWic fh^li^i 


3.9 1 


377 X79437 

378 1 M27902 


M.musculus whey 
acidic protein (WAP) 
gene exon 1 

Rat cardiac specific 
sodium channel alpha- 
subunit mRNA, 
complete cds. 


i>P!NULl: PULH BUITT ~ 

J COMPONENT SPC42 yeast 
J 1 (Saccharomyces cerevisiae) 
I >gi|486054 (228042) ORF 
1 |YKL042w [Saccharomyces 
J cerevisiae] >gi|666098 
J (X7I621) hypothetical 42.3 kD 
1 protein [Saccharomyces 
0-56 j 549670 cerevisiael 

1 ENDOGLUCANASE G 
J PRECURSOR 3.2. 1 .-) CelCCG 
I precursor - Clostridium 
~ j 585234 cellulolyticum eel lulolyti cum] 


3.9 1 
3.9 1 


II < 

1 1 * 

379 J AF036696 7 


Caenorhabditis 
:legans cos mid 
7 15B10 


1 gp70=envelope protein 
1 (endogenous pro virus ) host=cat 
1 lymphoid tissues, Peptide 445 
0.56 546071 aa] 


3.6 1 


1 C 
1 e 

B 

1 s 

1 [< 

380 1 299102 e 


"aenorhabditis 
Icgans cosmid 
0331, complete 
squence 
raenorhabditis 
egans] 


1 Kb 14iJi; putative reverse 

j transcriptase; QRF2; encodes aa 
1 motifs conserved in reverse 
1 transcriptases; most closely 
j (related reverse transcriptases are 
1 those of non-LTR 
1 retrotransposons. The 3' 901 bp 
I of this CDS are identical to the 
0.56 603664 3' 901 bp ... 


3.0 I 


1 E 

1 T 

381 1 L27850 |n 


quus caballus (clone 
131) T-cell receptor 
SA. V-resion. 


-2^1 1 1079150 |transcription factor shn - fruit fly 


1.7



HYPOTHETICAL 113.1 KD 



X97986 



M.musculus mRNA 
for desmocollin type 
1 



0.56 



AF087455 



Didelphis virginiana 
G protein receptor 
kinase 2 mRNA, 
complete cds 



2497227 



PROTEIN IN PRE5-FET4 
INTERGENIC REGION 
>gi|1072409 (Z54141) unknown



0.56 



1213453 



(U12964) contains ankyrin-like
repeats; similar to human
desmoplakin repeat region
[Caenorhabditis elegans]



1.7 



1.3 



D80011 



Human mRNA for 
KIAA0189 gene, 
complete cds 



0.56 



AJ002272 



L39210 



Mus musculus mRNA 
for HAP1-A protein,
3' region 



226535 



Homo sapiens inosine 
monophosphate 
dehydrogenase type II 
gene, complete cds 



protease [Hepatitis B virus] 



LI 



0.56 



3327158 



0.56 



628431 



(AB014572) KIAA0672 protein 
[Homo sapiens] 



coat protein - strawberry latent 
ringspot virus 



1.0 



0.77 



X02770 



Mouse Thy- 1.2 gene 
5' untranslated region 
and exon 1



0.56 



3327046 



(AB014516) KIAA0616 protein 
[Homo sapiens]



0.59 



AF038575 



Schizosaccharomyces
pombe Wiskott-
Aldrich Syndrome
protein homolog
(wsp1+) gene,
complete cds, and 
BTF3/beta-NAC 
gene, partial sequence 



0.56 



88466 



salivary proline-rich 
phosphoprotein precursor PRH1 
(allele PIF) - human >gi|190484
(K03203) prepro salivary 
proline-rich protein [Homo 
sapiens] >gi|190512 



0.35 



X56747 



Rat mRNA for fetal 
intestinal lactase- 
phlorizin hydrolase 
precursor, partial 



0.56 



Y12072



G.arboreum mRNA 
for farnesyl
pyrophosphate 
synthase 



2072742 



(Z48674) chitinase homologue
[Sesbania rostrata]



0.23 



0.56 



296670 



(X07882) Po protein [Homo 
sapiens] 



0.20 



S75756 



p15=cyclin D-
dependent kinases 4
and 6-binding
protein/p15 product

(exon/intron 1)

[human, brain tumors,
Genomic, 753 nt]



0.56 



1082743 



protein kinase (EC 2.7.1.37) 
SPRK - human sapiens] 
>gi|1090771|prf||2019437A
protein Tyr kinase I 



0.15



392



U62528



Equus caballus type
II collagen mRNA,
complete cds



0.56 



393 



X96877 



C.reinhardtii mRNA 
for unknown lumenal 
polypeptide 



461671 



[Segment 1 of 2] COLLAGEN 
.ALPHA WD CHAIN 



0.030 



0.56 



3341678 



(AC003672) putative zinc finger 
protein [Arabidopsis thaliana]



5e-09



394 



S78788 



cGATA-3 [chickens, 
liver, Genomic, 979 
nt, segment 4 of 4] 



395 



AF006640 



Drosophila 
melanogaster Ste20- 
like protein kinase 
mRNA. complete cds 



0.56 



2661590 



0.56 



1109830 



(AL009196) 1-

evidence=predicted by content;
1-method=genefinder;084; 1-
method_score=59.41; 1-
evidence_end; 2-
evidence=predicted by match; 2-
match_accession=AA950019; 2-
match_description=LD29959.5p
rime LD Drosophila
melanogas...



(U41534) coded for by C.
elegans cDNA CEESI42F;
Similar to helicases of
SNF2/RAD54 family.
[Caenorhabditis elegans]



2e-11



6e-12 



396 



AF006640 



397 



AE000716



398 



Z36079



Drosophila 
melanogaster Ste20 
like protein kinase 
mRNA, complete cds 



Aquifex aeolicus 
section 48 of 109 of 
the complete genome 



S.cerevisiae 
chromosome II 
reading frame ORF 
YBR210w



0.56 



1109830 



0.56 



3688350 



0.55 



<NONE> 



(U41534) coded for by C.
elegans cDNA CEESI42F; 
Similar to helicases of 
SNF2/RAD54 family. 
[Caenorhabditis elegans] 



CAL03£)9$6jdJii8$B24.4 

(novel PUTATIVE protein 
similar to hypothetical proteins 
S. pombe C22F3.14C and C. 
elegans C 16A3.8) [Homo 
sapiens] 



4e-13 



3e-66 



<NONE> 



<NONE> 



399 



Y17267



Mus musculus mRNA 
for ubiquitin 
conjugating enzyme 



0.55 



400 



AC001461 



Homo sapiens
(subclone 2_g5 from 
BACH107) DNA 
sequence 



<NONE> 



<NONE> 



<NONE> 



0.55 



<NONE> 



<NONE> 



<NONE> 






Alouatta seniculus 



401 | AF019079



breast and ovarian 
susceptibility 
(BRCA1) gene,
partial cds 



0.55 



402 | M90058



Human serglycin 
gene, exons 1,2, and 



<NONE> 



<NONE> 



<NONE> 



0.55 



<NONE> 



<NONE> 



403 | AB013469



Mus musculus CLM2 
gene for cytohesin 2, 
complete and partial 
cds, alternative 
splicing



404 | AJ011592



Bacteriophage P1 ban
gene 



0.55 



1729760 



0.55 



2493689 



(Z68152) chitinase [Gossypium 
hirsutum]



PHOTOSYSTEM II 10 KD
PHOSPHOPROTEIN deltoides]

>gi|2143326|gnl|PID|e319090
(Y13328) 10kDa
phosphoprotein [Populus
deltoides]



8.6 



405 | Z15U8 



T.brucei kinetoplast 
maxicircle variable 
region DNA 



0.55 



2970432 



(AF049132) NADH
dehydrogenase subunit 5
[Florometra serratissima]



406 | Z48951 



S.cerevisiae 
chromosome XVI 
cosmid 9723 



0.55 



4210432 



(AJ130783) APC2 protein [Mus
musculus] 



407 | U78726 



Homo sapiens mad
protein homolog
Smad2 gene,
promoter, exon 1a
and exon 1b



0.55 



3319290 



(AF055994) thyroid hormone 
receptor- associated protein 
complex component TRAP220 
[Homo sapiens]



408 | AG001389 



Homo sapiens 
genomic DNA. 21q 
region, clone: 
9HllBm42 



0.55 



125684 



KRUEPPEL PROTEIN
>gi|72899|pir||TWFF Krueppel
gap protein - fruit fly
(Drosophila sp.) melanogaster]

>gi|224875|prf||1202348A
Krueppel gene



409 | M27640



Plasmodium vivax 
major blood stage 
surface antigen gene. 
partial cds. 



0.55 



549453 



X-LINKED PEST-
CONTAINING
TRANSPORTER transporter -
human >gi|458255 (U05321) X-
linked PEST-containing
transporter [Homo sapiens]



4.9 



3.8 



3.8 



410 



411 



412 



D37977 



Fugu rubripes mRNA
for sodium channel
alpha subunit, partial
cds



M88505 



Ostertagia ostertagi
cathepsin B-like
cysteine protease
gene, partial cds.






0.55 



1435038 (D38024) ORF [Homo sapiens]



Xenopus laevis
mitotic
phosphoprotein 44
U95098 mRNA, partial cds



0.55 



(AF000900) p45 [Rattus
3941277 norvegicus]



413 I U89241 



Human mibp gene, 
partial cds



IXi 



414 | AF027151



415 | AF006821



417 | U38307



Xenopus laevis
survival of motor
neuron protein
interacting protein 1
(SIP1) mRNA,
complete cds
Bufo marinus
natriuretic peptide
receptor C mRNA,

partial cds

Lactococcus lactis
cremoris plasmid
pJW565 DNA,
llabiiM, llabiiR genes
and orfX

Mus musculus
collagen alpha-1 type
I gene, 5' flanking
region, partial
sequence.



0.55 



2570154 



0.55 



(U62253) 16kDa secretory 
4097465 protein [Sus scrofa]



418 | D13473



Mouse mRNA for
Rad51 protein



419 | AF045238



Bungarus fasciatus
acetylcholinesterase
gene, alternatively
spliced products,
partial cds



Methanobacterium
thermoautotrophicum
from bases 1 to
10205 (section 1 of
148) of the complete
genome



0.55 



0.55 



0.55 



0.55 



4007790 



2245075 



3386334 



1362802 



(AL034463) putative single-
strand polynucleotide binding

protein [Schizosaccharomyces
pombe]



0.55 



1374698 



0.55 



3261734 



0.55 



186396 



gastric mucin - human 
(fragment) >gi|547517



(D83032) nuclear protein, 
NP220 [Homo sapiens] 



(Z94752) hypothetical protein 
Rv 1004c 



2.9 



(AB008376) 17-kDa PKC-
potentiated inhibitory protein of
PP1 [Sus scrofa] | 2.8



2.2 



1.7 



[(Z97343) GTP-binding RAB2A 
protein | u 



(AF035120) type I procollagen
pro-alpha 2 chain [Canis
familiaris] | 1.3



1.3 



1.3 



0.99 



(M94131) mucin [Homo 
sapiens] 



0.97 



421 | X99537



422 | U08147



423 | Z56586



Y.lipolytica SEC62



gene

Aquilegia sp.
phytochrome
(PHYB/D) gene,
partial cds.



424 | U39442



425 | K02298



426 | X84792



427 | U00185



429 | AF031650



H.sapiens CpG DNA,
clone 12c8, reverse
read cpg!2c8.r tld . 

Mus musculus 
glutamine:fructose-6- 
phosphate 
amidotransferase 
(GFAT) gene, 5' 
region and partial cds 



Rat chymotrypsin B 
(chyB) gene, 
complete cds. 



M.musculus clusterin

gene

Capra aegagrus
Saanen and Weisse 
Edel breeds DR beta 
chain antigen binding 
domain, MHC class II 
DRB 







0.55 



0.55 



0.55 



0.55 



0.55 



H.sapiens CpG DNA, 
clone 178al2, reverse 
read cpg l78al2 .nl a 

Oryctolagus 
cuniculus anion 
exchanger 3 brain 
isoform (AE3) 
mRNA, complete cds



430 | M25579 



Bovine adenylyl 
cyclase Type I 
mRNA, complete cds



431 | Z48796 



H.sapiens Ski-W 
mRNA for helicase



0.55 



0.55 



0.55 



3876397 



2338024 






3320122 



0.55 



0.55 



282600 



3413810 



1652475 



2507136 



(U46007) espin [Rattus 
norvegicus]



hypothetical protein - 
Mycoplasma hyorhinis 



(Y17034) Bassoon [Mus
musculus] 



SUBTILIN BIOSYNTHESIS 
PROTEIN SPAB 



807646 



1778210 



2649040 



330452 



(M17294) unknown protein
[Human herpesvirus 4] 



(U68412) fibrillar collagen 
[Arenicola marina]



(Z81068) F25H5.2 

[Caenorhabditis elegans] | 0.58



(AF005370) ribonucleotide- 
reductase, large subunit I 0.57 



0.43 



(D90905) hypothetical protein | 0.25



0.19 



0.065 



0.044 



(AE000997) conserved 
hypothetical protein 

[Archaeoglobus fulgidus] | 0.023

(M14708) DNA polymerase
[Human cytomegalovirus] | 0.023



447 


Rattus norvegicus Q-
L16013 like gene sequence


0.54 


3087760 


(AJ005583) p75 protein
[Crypthecodinium cohnii] 


0.95


448 
449 


Capra hircus skeletal
muscle voltage-gated
chloride channel
gClC-1 mRNA,
U60275 partial cds

Myxococcus xanthus
ino/UJL O-antigen
biosynthesis operon,
rfbA, rfbB, and rfbC
U36795 genes, complete cds.


0.54 
0.54 


1781344 
3877232 


(Y10438) FK506 polyketide 
synthase 

(Z81540) predicted using 
Genefinder 


0.95
0.74 


450 


Drosophila 
melanogaster eyelid
(eld) mRNA, 
AF053091 complete cds 


0.54 


2144110 


zinc finger protein RIZ - rat 
>gi|949996


0.14


451 


Genome of the 
bacteriophage fd 
V00602 (Inoviridae). 


0.54 


2661620 


(AL009197) hypothetical 
protein 


0.11


452 


Human semaphorin 
(CD100) mRNA, 
U60800 complete cds


0.54 


125682 


KERATIN, ULTRA HIGH-
SULFUR MATRIX PROTEIN
(UHS KERATIN)
>gi|109116|pir||A36686 ultra-
high-sulfur keratin - sheep
>gi|1306 (X55294) ultra high-
sulphur keratin protein [Ovis
aries]


0.003


453 


S.coelicolor secD, 
X85969 secF & apt genes


0.54 


3874972


ZQQ70Q^ similar to elongation
factor Tu family (contains
ATP/GTP binding P-loop);
cDNA EST EMBL:D76223
comes from this gene; cDNA
EST yk478c5.5 comes from this
gene [Caenorhabditis elegans]


7e-06 


454 


H.sapiens mRNA for 
DAN26 protein, 
Y08265 partial


0.54


3875131


(Z70750) similar to vanadate
resistance protein
transmembranous domains
[Caenorhabditis elegans]


5e-12

455


uayoi j


Hydromantes
platycephalus
cytochrome b (cytb) 
gene, mitochondrial 
gene encoding 
mitochondrial 
protein, partial cds 


0.53 


<NONE> 


<NONE> 


<NONE> 


456 


AF034597 


Habrobracon hebetor 
cytochrome oxidase 
II gene, partial cds,
and tRNA-Asp, tRNA-
His, and tRNA-Lys
genes, complete
sequence,

mitochondrial genes 
for mitochondrial 
products 


0.53 


<NONE> 


<NONE> 


<NONE> 


457 


K02653 


Yeast (S.cerevisiae) 
tau repetitive element 
and Cys-tRNA.


0.53 


<NONE> 


<NONE> 


<NONE> 


458 


X53416 


Human mRNA for 
actin-binding protein 
(filamin) 


0.53 


2134839 


bullous pemphigoid antigen 2 - 
human 


6.2 


459


M55545


Drosophila
subobscura alcohol
dehydrogenase (Adh)
gene, and alcohol
dehydrogenase (Adh-
dup) gene, complete
cds's.


0.53 


2136865


hair keratin cysteine rich protein
sheep


2.1 




460 


UL9362 


Methanobacterium
thermoautotrophicum
methylene-
tetrahydromethanopterin
dehydrogenase
(mtd),
imidazoleglycerol-
phosphate
dehydrogenase
(hisB), and putative
ferredoxin (fdxA)
genes, complete cds,

orfs ...


0.53 


731969 


HYPOTHETICAL 91.6 KD
PROTEIN IN HXT8-CRT1 
INTERGENIC REGION 
>gi|1078261|pir||S50773 
probable membrane protein 

(Saccharomyces cerevisiae) 
>gi|496950 (Z34098) ORF . 
[Saccharomyces cerevisiae] 

>gl|IUljjyo (Z4y4<>7; tJKr 

YJL212c 


0.54 


461 


AB011527 


mRNA for MEGF1,
complete cds 


0.53 


417037 


GERM CELL-LESS PROTEIN 
fruit fly (Drosophila 
melanogaster) >gi|157490 
(M97933) germ cell-less protein 
[Drosophila melanogaster]


3e-06 


462 


U64313


Bacillus firmus MsyB
gene, 5' upstream

region and partial cds




<NONE>


<NONE> 


<NONE> 


463 


AF008590 


Caenorhabditis 
elegans paraquat
responsive protein 
(CePqM132) mRNA, 
complete cds 


0.52 


<NONE> 


<NONE> 


<NONE> 


464 


L10245


Mus saxicola 
spermidine/spermine 
N1-acetyltransferase
(SSAT) gene, 
complete cds. 


0.52 


<NONE> 


<NONE> 


<NONE> 


465


AF027173


Arabidopsis thaliana
cellulose synthase
catalytic subunit (Ath-
A) mRNA, complete
cds


0.52 


124263


INSULIN-LIKE GROWTH
FACTOR IB PRECURSOR
(IGF-IB) (SOMATOMEDIN C)
>gi|69361|pir||IGHU1B insulin-
like growth factor IB precursor -
human prepropeptide [Homo
sapiens]


7.7 





466


AL021066


Caenorhabditis
elegans cosmid
H31B20, complete 
sequence 
[Caenorhabditis 
elegans] 


0.52 


2589162 


(D88451) aldehyde oxidase [Zea 
mays] 


6.0 


467 


AF038588 


Porphyra linearis 18S 
ribosomal RNA gene, 
3' partial sequence 


0.52 


1055055 


(UXHbO) coded for by C. 
elegans cDNA yk37gl.5; coded 
for by C. elegans cDNA 
yk5c9.5; coded for by C. 
elegans cDNA ykla9.5; 
alternatively spliced form of 
F52C9.8b 


4.6 


468 


AE001125 


Borrelia burgdorferi 
(section 11 of 70) of
the complete genome 


0.52 


4115827 


(AB021287) polyprotein 
[Hepatitis G virus] 


2.0 


469 


AF006640 


Drosophila 
melanogaster Ste20-
like protein kinase
mRNA, complete cds


0.52 


1109830


(U41534) coded for by C. 
elegans cDNA CEESI42F; 
Similar to helicases of
SNF2/RAD54 family. 
[Caenorhabditis elegans] 


0.002 


470 


U90177 


Aplysia californica
ubiquitin carboxyl- 
terminal hydrolase 
(Ap-uch) mRNA, 
complete cds 


0.51 


<NONE> 


<NONE> 


<NONE> 


471 


Z28304


S.cerevisiae 
chromosome XI 
reading frame ORF 
YKR079c 


0.51 


<NONE> 


<NONE> 


<NONE> 


472 


Z92837 


Caenorhabditis 
elegans cosmid 
R03E1, complete
sequence
[Caenorhabditis
elegans]


0.51 


123506 


HYDROPHOBIC SEED 
PROTEIN (HPS) 


7.6 


473 


D13803


Mouse mRNA for 
RecA-like protein 
MmRad51, complete
cds 


0.51 


3327228 


(AB014607) KIAA0707 protein 
[Homo sapiens] 


4.5 


474 


X07187 


Pea hsp21 mRNA


0.51 


3328678 


(AE001299) hypothetical 
protein [Chlamydia trachomatis] 


4.4 



475 | S63168



476 | U67078



CCAAT/enhancer-
binding protein
delta=transcription
factor CRP3 homolog
[human, prostate
carcinoma cell line
LNCaP, Genomic,
1594 nt]






Xenopus laevis C2- 
HC type zinc finger
protein X-MyT1
mRNA, complete cds 



477 | L38933



478 | AF001000



479 | Z28304

480 | X97225



481 | AJ001388



Homo sapiens GT198
mRNA, complete

ORF

Lycopersicon
esculentum

polygalacturonase 1
S.cerevisiae
chromosome XI
reading frame ORF
YKR079c
Oncorhynchus keta 

IGF-II gene 

Homo Sapiens, RP58 
cDNA for complete 
mRNA 



0.51 



0.51 



0.51 



0.50 



0.50 



0.50 



0.50 






1653215 



3850320 



3219965 
<NONE> 

<NONE> 
<NONE> 

<NONE> 



(D90911) apolipoprotein N-
acyltransferase [Synechocystis
sp.]



(AF067520) PITSLRE protein 
kinase beta SV2 isoform [Homo 
sapiens] 






1.2 



HYPOTHETICAL 100.4 KD 
TRP-ASP REPEATS 
CONTAINING PROTEIN 
C2C6.04C IN CHROMOSOME 
I 



<NONE> 

<NQNE> 
<NONE> 

<NONE> 



0.17 



0.059 
<NONE> | 

<NONE> 
<NONE>| 



481


AJ001388


Homo Sapiens, RP58
cDNA for complete
mRNA


0.50 


<NONE> 


<NONE> 




482 


M86626 


P.occultum 23S 
ribosomal RNA,
partial cds. 


0.50 


<NONE> 


<NONE> 


<NONE>| 


483 


U76523 


Sambucus nigra lectin
precursor mRNA,
complete cds


0.50


1722856 


CHROMOSOME ASSEMBLY 

PROTEIN XCAP-E African 

clawed frog >gi|563814 

(U13674) XCAP-E [Xenopus

laevis] 


<NONE>| 
3.2


484 


AF031663 


Mus musculus striatin
mRNA, complete cds


0.50


179521 


(M63730) BPAG2 [Homo 
sapiens] 


3.2


485 


U32729


Haemophilus
influenzae Rd section
W of 163 of the
complete genome


0.50 


3875699 


(Z92829) F10A3.15
[Caenorhabditis elegans]




486 


Dictyostelium
discoideum clone
9.10 Tdd-3 and RED
repetitive elements,
AF067198 partial sequence


0.50 


2494740 


HYPOTHETICAL 28.3 KD 
PROTEIN IN GBD 5'REGION 
(ORF4) >gi|2120954|pir||I39562
ORF4 - Alcaligenes eutrophus
>gi|695274 (L36817) ORF4 


0.65
0.008


487 


Human interleukin 4
(IL-4) gene, complete 
M23442 cds. 


0.49 


<NONE> 


<NONE> 


<NONE>{ 


488


Caenorhabditis
elegans POU
homeobox protein
CEH-18 (ceh-18)
U16367 mRNA, complete cds.


0.47 


3786409


(AF098499) contains similarity
to Saccharomyces cerevisiae
MAF1 protein (GB:U19492)
[Caenorhabditis elegans]


8.9


489 


Lycopersicon 
esculentum
AF001000 polygalacturonase 1 


0.45 


<NONE> 


<NONE> 


<NONE>


490 


Yersinia

enterocolitica wbb
Z18920 gene cluster


0.41 


<NONE> 


<NONE> < 




491


Human mRNA for 
KIAA0230 gene, 
D86983 partial cds 


0.35 


206712


^64793) salivary proline-rich
protein [Rattus norvegicus]


<NONE>
4e-05


492


Helianthus tuberosus
lectin 2 mRNA,
AF064030 complete cds


0.33


<NONE>


<NONE>


<NONE>


493


AF067083


Vitreoscilla sp. outer


membrane protein 
■ivniuiug gene, 
complete cds; Trp 
repressor binding 
protein gene, partial 
cds; and unknown 
genes 


0.33 


401553 


HYPOTHETICAL 24.5 KD 
PROTEIN IN NADB-SRMB 
INTERGENIC REGION 8 3 


494 


Y15520 


Papio hamadryas 
anubis gene encoding 
fertilin alpha-II 


0.29 


2408049 . 


(299164) hypothetical Drotein 1 1 i 


495 
496 


U33475 
D88356 


Alestes sp. 
ependymin mRNA, 

partial cds

mouse Urt A for o- 
oxodGTPase, 
complete cds 


0.28 
0.22 


3913078 
<NONE> 


ARYL HYDROCARBON

RECEPTOR NUCLEAR

TRANSLOCATOR

HOMOLOG (DARNT)

(TANGO PROTEIN)

transcription factor [Drosophila

melanogaster] | 1.4

<NONE> <NONE> 


497 
498 


U67603 


Methanococcus 
jannaschii section 145 
of 150 of the 
complete genome 

Malurus cyaneus 
microsatellite McyU2 


0.22 
0.22 


2209261 
992631 1 


(U51222) p40 [Streptomyces
halstedii] 1 g 3

(U29131) Mg-chelatase subunit
[Synechocystis sp.] | 0.56


499
500


Z49625
U64830


S.cerevisiae
chromosome X
reading frame ORF

YJR125c

Dictyostelium
discoideum AX2
protein tyrosine
kinase gene, complete
cds.


0.21 
0.21 


<NONE> 
<NONE> 


<NONE> <NONE> 




M24543


Human prostate-
specific antigen (PA)
gene, complete cds.


0.21


2764859


(X97918) gene 12.1

[bacteriophage SPP1] | 6.0

502 



503 



504 






X87618 



X71591



B.taurus mRNA for
thrombospondin

(partial) 216 2 b]
B.taurus

microsatellite
sequence INRA048



X57808 



Human germline
immunoglobulin 
lambda light chain 
gene 



506 



507 



509 



510 



511 



Xenopus laevis 
mitotic 

phosphoprotein 44 
U95098 mRNA, partial cds



Mycobacterium 
fortuitum plasmid 
pJAZ38 replication 
protein Rep (rep)
U84216 gene, complete cds 






uOOO'ib protein - 






0.21 



0.21 



0.21 



Rattus norvegicus
nonmuscle myosin
heavy chain-A
U31463 mRNA, complete cds.



Rabbit mRNA for 
aminopeptidase N 
X51508 (partial)

Homo sapiens full
length insert cDNA
AF086476 clone ZDSSFI7



AF077006 



Helicobacter pylori
plasmid pHPM186,
complete sequence



X734S0 (E.gunnii CAD gene. 



0.21 



0.21 



0.21 



0.21 



0.20 



0.20 



0.20 



2146000 



1354453 



Mycobacterium tuberculosis 
tuberculosis] 

>gi|1694863|gnl|PID|e283373
(Z83018) hypothetical protein
Rv2968c [Mycobacterium
tuberculosis]



3.5 



(U52830) orf [Homo sapiens] | 2.7



211915^ 



2497139 



procollagen type V alpha 2 - 
mouse >gi|309181



2499087 



3880111 



630864 



<NONE> 



<NQNE> 
<NONE> 



HWlkfeHLAL ;S.8KD 

PROTEIN IN ABF2-CHL 12 

INTERGENIC REGION 

>gi|1078003|pir||S52835 

hypothetical protein YMR075w - 

yeast (Saccharomyces 

cerevisiae) >gi|763022 

(Z48952) unknown 

[Saccharomyces cerevisiae]

UDP-GLUCOSE:GLYCOPROTEIN
GLUCOSYLTRANSFERASE
PRECURSOR (DUGT)
glucosyltransferase - fruit fly
(Drosophila sp.)
glucosyltransferase precursor
[Drosophila melanogaster]



2.7 



2.0 



(Z81130) predicted using
Genefinder 



LRR47 protein - fruit fly 
(Drosophila melanogaster) 
>gi|415947 (X75760) LRR47
[Drosophila melanogaster]



<NONE> 



<NONE> 



<NONE> 



0.003 



0.002 



1e-06



<NONE> 



<NONE> 



WO 01/02568 
SEQ
ID 



512 



513 



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION 



514 



515 



X75036 



D90875 



Z68343



X62486 



DESCRIPTION

T.aestivum
mitochondrial nad7
gene for NADH
dehydrogenase
subunit 7

E.coli genomic DNA,
Kohara clone
#422(55.5-55.8 min.)

Caenorhabditis
elegans cosmid
F59B8, complete
sequence
[Caenorhabditis
elegans]



P VALUE 



0.20 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



0.20 



ACCESSION 



<NONE> 



<NONE> 



M.musculus V alpha 
j 1.1 gene 5'-region 



516 



517 



AF040651 



U10470 



518 



D83778 



519 



S43579 



520 



U07357 



Caenorhabditis 
elegans cosmid
WQ4H10 

Pseudomonas 
fluorescens PHA 
depolymerase (phaZ) 
gene, complete cds. 



0.20 



<NONE> 



0.20 



DESCRIPTION 



<NONE> 



<NONE> 



<NONE> 



<NONE>



<NONE> 



<NONE> 



Human mRNA for
KIAA0194 gene, 
partial cds 



c-scr=pp60c-src, 
sdr=src downstream 

region 

Mus musculus Balb/c 
brain-specific kinase 
(Bsk) mRNA. 
complete cds. 



0.20 



0.20 



1170683 



3721862 



0.20 



126363 



0.20 



4159887 



0.20 



206712 



PHOSPHORYLASE B
KINASE ALPHA
REGULATORY CHAIN,
SKELETAL MUSCLE
ISOFORM
(PHOSPHORYLASE KINASE
ALPHA M SUBUNIT)
>gi|2135923|pir||I38111
phosphorylase kinase (EC
2.7.1.38) - human >gi|791043



(AB016024) Pfj2 [Plasmodium
falciparum] 



LAMININ ALPHA-1 CHAIN
PRECURSOR precursor - 
human 



7.4 



1.9 



(ALU04908) similar to 

ribosomal protein L23a; similar 
to P29316 (PID:g132848)
[Homo sapiens]



0.65 



(M64793) salivary proline-rich 
protein [Rattus norvegicus]

0.51






PCT/US00/18374 



Nearest Neighbor (BlastN vs. Genbank) 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



SEQ 
ID 



ACCESSION 



521 



522 



523 



524 



DESCRIPTION 



P VALUE 



ACCESSION 



AF034460 



Penicillium thomii
internal transcribed 
spacer 1, 5.8S 
ribosomal RNA gene 
and internal 
transcribed spacer 2, 
complete sequence; 
and 28S ribosomal 
RNA gene, partial 
sequence 



U95098 



X95971 



L41502 



Xenopus laevis 
mitotic 

phosphoprotein 44 
mRNA. partial cds 



S.lividans groEL2
gene 



DESCRIPTION 



0.20 



0.20 



Ovis aries 
vasopressin V1
receptor (V1R) gene, 
complete cds 



0.20 



0.19 



114136 



2842674 



3925277 



P VALUE



AMINO-ACID 

ACETYLTRANSFERASE 

Pseudomonas aeruginosa 

>gi|151036 (M38358) N-
acetylglutamate synthase
[Pseudomonas aeruginosa]

POU DOMAIN CLASS 2,
ASSOCIATING FACTOR 1 (B
CELL-SPECIFIC
COACTIVATOR OBF-1) (OCT
BINDING FACTOR 1) (BOB-
1) (OCA-B) Bob1, B-cell-
specific - mouse
>gi|18Sl8l8|bbs|179852
mBob1=B-cell specific
transcriptional coactivator line
J558L, Peptide, 256 aa]
>gi|1353792 (U43788) Oct
binding factor 1 [Mus musculus] 



<NONE> 



(ALUJJ64_j) similar to 
Uncharacterized protein family 
UPF0034. Double-stranded 
RNA binding motif; cDNA EST 
vk4S9b3.5 comes from this 
gene; cDNA EST yk439g7.5 
comes from this gene 
[Caenorhabditis elegans]



0.39



0.073 



<NONE> 



4e-19 



<NONE> 



525 



J03885 



526 



AE001451 



K.pneumoniae
oxalacetate 
decarboxylase alpha 
subunit gene, 
complete cds. 



0.19 



Helicobacter pylori, 
strain J99 section 12 
of 132 of the 
complete genome 



<NONE> 



<NONE> 



<NONE> 



0.19 



<NONE> 



<NONE> 



<NONE> 



SEQ
ID 



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION 



DESCRIPTION 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



DESCRIPTION 



P VALUE 



527 



528 



D88084 



Pedicularis 
verticillata 
chloroplast DNA 
intergenic region 
between trnT(UGU)
and trnL(UAA)5'exon 



U67599 



Methanococcus 
jannaschii section 141 
of 150 of the 
complete genome 



0.19 



<NONE>



0.19 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



529 



J05500 



Human beta-spectrin 
(SPTB) mRNA, 
complete cds. 



0.19 



<NONE> 



<NONE> 



530 



531 



YI0137 



M.mycoides ftsY 
gene homologue and 
gene encoding 
hypothetical protein 



0.19 



<NONE> 



AF027174 



Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath 
B) mRNA, complete 
cds 



0.19 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



532 



D43805 



Mouse thymic 
stromal cell mRNA 
for TLSF-beta, 
complete cds 



0.19 



<NONE> 



<NONE> 



<NONE> 



533 



AJ012585



Tetrahymena 
thermophila 
macronuclear gene 
encoding ribosomal 
protein L3, exons 1-2



534 



X51475 



Brassica napus 5-
enolpyruvylshikimate
-3-phosphate synthase
gene 



0.19 



0.19 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



535 



AF074386 



Sambucus nigra 
hevein-like protein
mRNA, complete cds



0.19 



<NONE> 



<NONE> 



536 



249625 



S.cerevisiae
chromosome X
reading frame ORF
YJR125c 



0.19 



<NONE> 



<NONE> 



Nearest Neighbor (BlastN vs. Genbank)



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



SEQ 
ID 



ACCESSION 



DESCRIPTION 



P VALUE 



537 | X63741



H.sapiens pilot
mRNA 



0.19 



ACCESSION 
<NONE> 



DESCRIPTION
<NONE> 



P VALUE 



538 | Y11255



O.latipes mRNA for 
annexin max4 



0.19 



<NONE> 



<NONE> 



539 | L63537



Oncorhynchus mykiss 
(clone Jb-10) beta-2 
microglobulin (B2m) 
mRNA, complete cds 



0.19 



540 | X70903



541 | U61958 



N.tabacum T92 gene
for auxin-binding 
protein 



<NONE> 



<NONE> 



0.19 



Caenorhabditis 
elegans cosmid 
C25A8 



<NONE> 



<NONE> 



0.19 



<NONE> 



<NONE> 



<NONE>



<NONE>



542 | U33959 



Macaca fascicularis 
fertilin beta mRNA, 
complete cds 



0.19 



<NONE> 



<NONE> 



543 | 249835 



H.sapiens mRNA for 
protein disulfide 
isomerase 



544 | AP035458



Spinacia oleracea 
heat shock 70 protein 
protein, complete cds 



0.19 



2113940 



(Z95556) hypothetical protein 
Rv2507 



0.19 



267293 



PROBABLE E4 PROTEIN
papillomavirus (type I)
>gi|61015 (X62844) E4 gene
product [Pygmy chimpanzee
papillomavirus type 1] 



9.4 



545 | U23441 



Tetrahymena 
thermophila B 
internal deletion 
sequence. 



0.19 



546 | U53921



547 | L11002



Pneumocystis carinii 
major surface 
glycoprotein 



3877185 



(Z66563) F46C3.2
[Caenorhabditis elegans]



0.19 



3548901 



Rat ankyrin binding 
glycoprotein-1 related
mRNA sequence, 



0.19 



3337352 



(AC004481) putative chromatin
structural protein Supt5hp 



9.3 



(AF052502) DA26 homolog
[Epiphyas postvittana
nucleopolyhedrovirus]

9.3



548 



Methanococcus 
jannaschii section 102 
of 150 of the 
U67560 complete genome



0.19 



3183689 



(Y13585) serotonin receptor 4 
[Cavia porcellus] 



549 



U18424



Mus musculus 
bacteria binding 
macrophage receptor 
MARCO mRNA. 
complete cds. 



0.19 



3659853 



(AF0S90S3) complement 
component CI qB like protein 



t<ri 






SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION 



550 | X66467 



551 | AF003487



552 | J05087



553 | AF080464



554 | U78876



DESCRIPTION | P VALUE



C.albicans sec 18 gene 



Syngaster lepidus 16S 
ribosomal RNA gene, 
partial sequence 



Rat calmodulin- 
sensitive plasma 
membrane Ca2+- 
transporting ATPase 
(PMCA3) mRNA, 
complete cds. 
Homo sapiens 
glutamate 
oxaloacetate 
transaminase 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



DESCRIPTION 



0.19 



1326385 



0.19 



3122039 



Human MEK kinase 
mRNA, complete 
cds 



555 | AB009077



556 | U95098



Vigna radiata mRNA 
for proton 
pyrophosphatase, 
complete cds 



Xenopus laevis 
mitotic 

phosphoprotein 44 
mRNA. partial cds 



0.19 



0.19 



422462 



3024834 



0.19 



557 | AE000392



Escherichia coli K-12 
MG1655 section 282 
of 400 of the 
complete genome 



1710445 



0.19 



3256922 



0.19 



4226159 



0.19 



(U58751) C07G1.7 gene
product [Caenorhabditis
elegans]



DIHYDROPYRIMIDINASE
(DHPASE) dihydropyrimidinase 



rat 



>gi|137SQ19|gnl|PID|d1010479



hypothetical protein - fruit fly
(Drosophila melanogaster)
>gi|296434 (X68408) ORF
[Drosophila melanogaster]

PROBABLE E4 PROTEIN 
>gi|790898 position 3286.-3288 
is first start codon: putative 



(U78083) unknown [Emericella 
nidulans]



(AP000002) 256aa long 
hypothetical protein 
[Pyrococcus horikoshii]



3645960 



(AF125463) contains similarity
to BTB (also known as BR-
C/Ttk) domains (Pfam:PF00651,
Score=62.8, E=7.6e-15, N=1)
[Caenorhabditis elegans]

(AL031583) 1-
evidence=predicted by content;
1-method=genefinder;084; 1-
method_score=47.46; 1-
evidence_end; 2-
evidence=predicted by match; 2-
match_accession=SWISS-
PROT:P23792; 2-
match_description=DISCONNE
CTED PROTEIN.; 2-matc...



P VALUE 



6.9 



6.9 



5.3 



5.3 



5.3 



5.1 



4.1 






SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



DESCRIPTION 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



DESCRIPTION 



P VALUE



558 | AE000392



559 | L81774 



Escherichia coli K-12 
MG1655 section 282 
of 400 of the 
complete genome
Homo sapiens
(subclone 3_d1 from
P1 H25) DNA
sequence



560 | AL021108 



Drosophila 
melanogaster cosmid
clone 137E7



0.19 



0.19 



3645960 



4001725 



(AL031583) 1-
evidence=predicted by content;
1-method=genefinder;084; 1-
method_score=47.46; 1-
evidence_end; 2-
evidence=predicted by match; 2-
match_accession=SWISS-
PROT:P23792; 2-
match_description=DISCONNE
CTED PROTEIN.; 2-matc...



(AB015981) MnhA
[Staphylococcus aureus]



561 | AB001510



Carabus
leptoplesioides
mitochondrial DNA 
for NADH 
dehydrogenase 
subunit 5, partial cds 



0.19 



4001688 



(AB015718) protein kinase
[Homo sapiens] 



562 | AF069696

Egernia stokesii clone
EST1 microsatellite

0.19



0.19 



3758855 



563 | X64144 



564 | U56897 



F.pringlei ppcA1
gene for
phosphoenolpyruvate
carboxylase

Human
immunodeficiency 
virus type 1 gag 
polyprotein (gag) 
gene, partial cds 



565 | U57975



Danio rerio Notch
homologue 3 mRNA,
complete cds 



3328994 



0.19 



0.19 



3242974 



2257710 



0.19 



3874971 



(Z98551) MAL3P6.11
[Plasmodium falciparum]



(AE001326) Amino Acid 
(Branched) Transport 
[Chlamydia trachomatis] 



(AF069555) G protein-coupled 
receptor p2y3 [Meleagris 
gallopavo] 



(U73041) resolvase-like protein
[Thiobacillus ferrooxidans]



^vv/uy; similar to NAD 
dependant 

epimerase/dehydratase family; 
cDNA EST EMBL:C10103
comes from this gene; cDNA
EST EMBL:D66400 comes
from this gene; cDNA EST
EMBL:D70143 comes from this
gene; cDNA EST yk493h11.3
comes from ... 



4.0 



3.0 



3.0 



2.4 



2.4 



2.3 



2.3 



1.8 








Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












masquerade precursor - fruit fly 




566 


Y12502 


R.norvegicus mRNA 
for factor XIIIa


0.19 


2133693 


(Drosophila melanogaster) 

>gi|665545 (U18130) 

masquerade [Drosophila 

melanogaster] 

><Til 1 095942 InrflGl 10286A 

masquerade gene 


1.8 


567 


S82470 


BB1=malignant cell
expression-enhanced
gene/tumor
progression-enhanced
gene [human, UM-
UC-9 bladder
carcinoma cell line, 
mRNA, 1897 nt] 


0.19 


2444026 


(U77783) N-methyl-D-aspartate 
receptor 2D subunit precursor 

lr-Tnmf\ cnnifncl 


1 Q 


568 


U97408 


Caenorhabditis 
elegans cosmid
F48A9 


0.19 


542433 


225 K protein - Babesia bovis 
(fragment) 


1.8 


569 


U10470 


Pseudomonas 
fluorescens PHA 
depolymerase (phaZ) 
gene, complete cds. 


0.19 


3721862 


(AB016024) Pfj2 [Plasmodium
falciparum] 


1.7 


570 


M88160 


Ovis aries MAF214 
locus polymorphic 
dinucleotide repeat.


0.19 


1293816 


(U56963) T13A10.5 gene
product [Caenorhabditis
elegans]


1.4 






mRNA for pollen 
allergen (Hol l 2,
group II) >
emb|AJ131339|LIT13
1339 Lolium italicum 
mRNA for pollen 
allergen (Lol i 2, 
group II) > allergen 
(Poa p 2, group II) > 




• 






571 


AJ131336


emb|AJ131338|TAE1
31338 Triticum
aestivum mRNA for
pollen allergen (Tri a
2, group II)


0.19 


3880447


(AL032675) predicted using
genefinder


0.82 


572 


X84036


S.cerevisiae ARG8
and CDC33 genes


0.19 


3882041


(AJ010405) hypothetical protein


0.62 








Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Human WD protein 






mucin - human >gi|501033 




573 


U57058 


DUO pre-mRNA, 
partial cds 


0.19 


631302 


(U14383) mucin [Homo 
sapiens] 


0.60 


574 


AF034460 


Penicillium thomii
internal transcribed
spacer 1, 5.8S
ribosomal RNA gene
and internal
transcribed spacer 2,
complete sequence;
and 28S ribosomal
RNA gene, partial
sequence


0.19


114136 


AMINO-ACID 
ACETYLTRANSFERASE
Pseudomonas aeruginosa
>gi|151036 (M38358) N-
acetylglutamate synthase 
[Pseudomonas aeruginosa] 


0.35 


575 


U95098 


Xenopus laevis 
mitotic 

phosphoprotein 44 
mRNA, partial cds 


0.19 


105270 


alpha-2radrenergic receptor - 
human name ADRA2R' [Homo 
sapiens] 


0.27 


576 


AG001475


Homo sapiens
genomic DNA, 21q
region, clone:
125H6N2 


0.19 


94977 


hypothetical protein 3 - 
Pseudomonas sp. (DSM 6898) 
plasmid pKB740 >gi|45867
(X66604) ORF3 


0.16 


577 


M63284 


Mouse IgG receptor 
(beta-Fc-gamma-RII)
gene, exons 9 and 10, 
clones lambda- 
Fc(3.2,93). 


0.19 


3024681 


TRANSCRIPTION 
INITIATION FACTOR TFIID 
135 KD SUBUNIT (TAFII-135)
(TAFII135) (TAFII-130) of
RNA polymerase II transcription 
factor TFIID [Homo sapiens] 


0.088 


578 


U38241 


Pseudomonas 
aeruginosa orotate 
phosphoribosyl
transferase (pyrE), 
catabolite repression 
control protein (crc) 
and RNasePH (rph) 
genes, complete cds 


0.19 


3044086 


(AF055904) unknown 
[Myxococcus xanthus] 


0.052 


579 


AF039734 


Lontra longicaudis 
transthyretin intron 1,
partial sequence


0.19 


322759 


pistil extensin-like protein 
(clone pMG14) - common
tobacco (fragment) >gi|19927
(Z14015) pistil extensin like
protein [Nicotiana tabacum] 


0.030 


580 


U95094 


Xenopus laevis XL- 
INCENP (XL-
INCENP) mRNA,
complete cds


0.19 


2147194


collagen - Paralvinella grasslei


0.002 






Nearest Neighbor (BlastN vs. Genbank)



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



SEQ
ID | ACCESSION



DESCRIPTION 



P VALUE 



ACCESSION 



DESCRIPTION 



P VALUE



581 | AB004232



Drosophila
melanogaster mRNA
for DAD polypeptide,
complete cds



0.19 



582 | AF098919 



Gallus gallus alpha-
globin gene domain 5
region



2498765 



PEROXISOMAL MEMBRANE 
PROTEIN PEX16 lipolytica) 



0.19 



1086863 



(U41272) T03G11.6 gene
product [Caenorhabditis
elegans]



0.002 



4e-05 



583 | AE001457 



584 | L10329



585 | AE001155 



Helicobacter pylori, 
strain J99 section 18 
of 132 of the 
complete genome 
Plasmid RP4 traE 
gene, 3' end; traD 
gene, complete cds; 
trap gene, 5' end. 



0.19 



2924552 



(AL022018) 1-
evidence=predicted by content;
1-method=genefinder;084; 1-
method_score=165.48; 1-
evidence_end; 2-
evidence=predicted by match; 2-
match_accession=AA264666; 2-
match_description=LD08351.5p
rime LP Drosophila melanoga...



0.19 



3878117 



(Z49068) mitochondrial carrier 
protein 



Borrelia burgdorferi
(section 41 of 70) of 
the complete genome 



586 | U49979



Orf virus E10R
homolog gene, partial 
cds, and DNA 
polymerase gene, 
complete cds 



0.19 



861276 



0.19 



3850072 



(U28739) similar to TPR 
domains in e.g. yeast STI1
protein [Caenorhabditis elegans]



(AL033385) dna-directed rna
polymerase iii subunit
[Schizosaccharomyces pombe]



3e-05 



8e-07 



2e-12 



1e-15



587 | U88155



Xenopus laevis 
RanGTPase 
activating protein 



0.19 



995714 



(X91258) pid:el98503 
[Saccharomyces cerevisiae]



588 | AF061854 



Schizosaccharomyces 
pombe Clr4p (clr4)
gene, complete cds 



0.19 



3242750 



(AC005164) match to ESTs
AA731149 (NID:g2140138),
AA731908 (NID:g2752719),
AA287837 (NID:g1933519),
AA262811 (NID:g1898382),
and AA825820 (NID:g2899132)



589 | M23865



S.cerevisiae CHS2 
gene encoding chitin 
synthase. 



0.18



<NONE>



<NONE> 



<NONE> 






Nearest Neighbor (BlastN vs. Genbank) 



SEQ
ID | ACCESSION



590 | U95094

591 | AF0676ID



592 | AF036329



DESCRIPTION 



Xenopus laevis XL- 



P VALUE 



INCENP (XL- 
INCENP) mRNA, 
complete cds 
Caenorhabditis 
elegans cosmid 
F41A4



Homo sapiens
gonadotropin-
releasing hormone
precursor, second
form (GnRH-II) gene,
complete cds 



0.18 



0.18 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



<NONE> 



<NONE> 



593 | Z49216



594 | X02167



595 | Z31561



596 | L81692



597 | X57310



598 | U18315



H.sapiens
mitoxantrone 
resistance associated 
mRNA 



Torulopsis glabrata 
mitochondrial DNA 
for tRNA-Thr, -His
and -Glu upstream of
cytochrome b gene 



R.communis
(Carmencita) Scr1
mRNA for sucrose 
carrier 



Homo sapiens
(subclone 2_c9 from
P1 H56) DNA
sequence 



Nocardia 

lactamdurans pcbAB 
and pcbC genes for 
alpha-aminoadipyl-L-
cysteinyl-D-valine
synthetase and
isopenicillin N
synthase

Sus scrofa



0.18 



<NONE> 



0.18 



<NONE> 



DESCRIPTION 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



0.18 



<NONE> 



0.18 



<NONE> 



0.18 



1346575 



parathyroid receptor 
(PTH) mRNA, 
complete cds 



0.18 



126404 



0.18



1022323 



<NONE> 



<NONE> 



55 KD ERYTHROCYTE 
MEMBRANE PROTEIN 



P VALUE 



<NONE> 



<NONE> 



<NONE> 



SEED LIPOXYGENASE-2 (L-
2) - soybean >gi|170014 (J03211)
lipoxygenase (EC 1.13.11.12)



(X04647) collagen alpha-2(IV) 
chain [Mus musculus] 



8.4 



6.5 



3.8 






Nearest Neighbor (BlastN vs. Genbank)

SEQ

ID | ACCESSION | DESCRIPTION | P VALUE



599 | AL010158 



Plasmodium 
falciparum DNA *** 
SEQUENCING IN 
PROGRESS *** 
from contig 3-85. 
complete sequence 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



DESCRIPTION 



600 | AB005287



601 | AL021108 



Bos taurus mRNA for 
thrombospondin 1, 
complete cds 



Drosophila 
melanogaster cosmid 
clone 137E7 



0.18 



2506816 



0.18



2146000



P VALUE



0.18 



3483032 



VERSICAN CORE PROTEIN
PRECURSOR
PROTEOGLYCAN CORE
PROTEIN 2) (GLIAL
HYALURONATE-BINDING
PROTEIN) (GHAP) >gi|608515
(U16306) chondroitin sulfate
proteoglycan versican V0 splice-
variant precursor peptide
uuoo^b protein - 

Mycobacterium tuberculosis 
tuberculosis] 

>gi|1694863|gnl|PID|e283373
(Z83018) hypothetical protein
Rv2968c [Mycobacterium 
tuberculosis] 



(AL031371) hypothetical 
protein SC4G2.06 
[Streptomyces coelicolor]



3.7 



2.9 



602 | U57975 



Danio rerio Notch 
homologue 3 mRNA, 
complete cds 



0.18 



85719 



collagen alpha 1'(II) chain
precursor - African clawed frog



603 | M30124



604 | X54965



605 | U95098



P.aeruginosa 
autonomously 
replicating sequence. 



G.sp alpha 5HR DNA 



0.18 



3878017 



(ALU-' I jS7) similar to Zinc" 
finger, C4 type (two domains); 
cDNA EST yk452f4.5 comes 
from this gene; cDNA EST 
EMBL:T00774 comes from this
gene receptor NHR-3 
[Caenorhabditis elegans] 



Xenopus laevis 
mitotic 

phosphoprotein 44 
mRNA. partial cds 



606 | U20793



Oryctolagus 
cuniculus renal 
sodium-dependent 
phosphate transporter 
type II mRNA, 
complete cds. 



0.18 



134304 



STEM CELL PROTEIN 
chicken >gi|62845 (X63371) 
transforming capacity [Gallus
gallus]



0.18



1628403 



0.18



1705984 



(X98S93) hTAFII68 [Homo 
sapiens] splicing [Homo 
sapiens] 



92 KD TYPE IV
COLLAGENASE
PRECURSOR IV, 92K,
precursor - rat >gi|1022784
(U36476) 92-kDa type IV
collagenase [Rattus norvegicus]



1.3 



1.3 



1.3 



1.2 






SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION 



DESCRIPTION 



607 | U23427 



608 | U49953 



609 | J00182



610 | X62513



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



611 | X04862



Human
cholecystokinin type
A receptor (CCK-A)
gene, exons 1 and 2.

Rattus norvegicus
protein kinase MUK2
mRNA, complete cds
Human alpha globin
gene cluster on
chromosome 16: zeta
gene.



M.gallopavo gene for
metallothionein



P VALUE | ACCESSION



Goat embryonic alpha 
globin gene zeta 
exons 2-3 



0.18 



0.18 



0.18 



0.18 



3261734 



551238 



1585259 



2494740 



612 | M12450



613 | AF038539



614 | X78401



615 | D38754



Rat vitamin D 
binding protein 
(DBP) mRNA, 
complete cds. 



Mus musculus muscle 
NSP-like 1 (Nspl1)
mRNA, complete cds

Bacteriophage P22
right operon, orf 48,
replication genes 18
and 12, nin region
genes, ninG
phosphatase, late
control gene 23, orf
60, complete cds, late
control region, start
of lysis gene 13

Pig mRNA for inter-
alpha-trypsin
inhibitor heavy-chain
H1, complete cds



0.18 



DESCRIPTION 



(Z94752) hypothetical protein 
Rv 1004c 



(XS1347) pectate lyase 1
[Erwinia carotovora]



traJ gene [Amycolatopsis
methanolica]
HYPOTHE
PROTEIN IN GBD 5'REGION
(ORF4) >gi|2120954|pir||I39562
ORF4 - Alcaligenes eutrophus
>gi|695274 (L36817) ORF4



P VALUE 



86837 androgen receptor B - human 



0.18 



4210432 



0.18 



3297877 



(AJ130783) APC2 protein [Mus
musculus]



(AJ224S6S) GNAS1 [Homo
sapiens]



0.18



0.13 



1123087 



1397275 



(U42436) C49H3.3 gene 
product [Caenorhabditis 
elegans]



(U61947) C06G3.8 gene
product [Caenorhabditis
elegans]



0.97 



0.43 



0.41 



0.31 



0.082 



0.038 



0.029 



0.009 








Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












LRR47 protein - fruit fly 




616 


X51508 


Rabbit mRNA for 
aminopeptidase N 
(partial) 


0.18 


630864 


(Drosophila melanogaster) 
>gi|4 15947 (X75760) LRR47 
[Drosophila melanogaster] 


6e-07 


617 


X54850 


S.kluyveri linear 
plasmid pSKL DNA 
for open reading 
frames 1-10 


0.18 


3183405 


HYFUlHhl'lLAL FlJ KJD 
PROTEIN C2C6.07 IN 
CHROMOSOME I 
>gi|2370504|gnl|PID|e339194 
pombe] 

>gi|3451305|gnl|PID|e1316730
(AL031324) very hypothetical 
protein [Schizosaccharomyces 
pombe] 


2e-08 


618 


L21954 


Human peripheral 
benzodiazepine 
receptor gene, exon 4. 


0.18 


3925211 


(AJ-U j^OZOJ CUINA i 

EMBL:D70654 comes from this 
gene; cDNA EST 
EMBL:Z14359 comes from this
gene; cDNA EST 
EMBL:D33409 comes from this 
gene; cDNA EST 
EMBL:D36239 comes from this 
gene; cDNA EST 
EMBL:Z14766 comes from this 
gene...


4e-09 


619 


U09355 


Oryctolagus 
cuniculus protein 
phosphatase 2AI B 
gamma subunit 
(skeletal muscle 
isolate) mRNA, 
complete cds. 


0.18


3947877


(AL034382) putative mitosis 
and maintenance of ploidy 
protein [Schizosaccharomyces 
pombe]


8e-11


620 


X58715 


T.cruzi hsp70 mRNA 
for 70 kDa heat shock 
protein, partial cds 


0.18 


3024081 


MYOSIN LIGHT CHAIN 
KINASE, SMOOTH MUSCLE 
AND NON-MUSCLE 
ISOZYMES (MLCK) 
(CONTAINS: TELOKIN)


9e-12 


621 


AF060195


Mus musculus 
proteasome regulator 
PA28 beta subunit 
gene, complete cds 


0.18 


861276 


(U28739) similar to TPR
domains in e.g. yeast STI1
protein [Caenorhabditis elegans]


1e-14


622 


L27235 


Methylobacterium
extorquens serine 
cycle proteins 


0.18 


2688949 


(AF027208) AC133 antigen
[Homo sapiens]


1e-14





Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















623 


AF006573


Drosophila virilis
maltase 1 (Mav1) and
maltase 2 (Mav2) 
genes, complete cds 


0.18 


2500558 


PUTATIVE RIBONUCLEASE
III (RNASE III)
>gi|3876420|gnl|PID|e1346063
(Z81070) similar to ribonuclease
[Caenorhabditis elegans]


2e-23 


624 


AF001782 


Staphylococcus 
aureus strain SA502A 
AgrB 


0.17 


<NONE> 


<NONE> 


<NONE> 


625 


AJ223364 


Homo sapiens germ- 
line DNA upstream of
Jkappa locus


0.17 


<NONE> 


<NONE> 


<NONE> 


626 


J03059 


Human 

glucocerebrosidase
(GCB) gene, 
complete cds 


0.17 


<NONE> 


<NONE> 


<NONE> 


627 


AB008860 


Fugu rubripes Cal2 
gene for pheromone 

cds 


0.17 


2198849 


(AF004900) E3KARP [Homo
sapiens] >gi|2665826
(AF035771) Na+/H+ exchanger
regulatory factor 2 [Homo
sapiens] factor 2 [Homo
sapiens]

>gi|3613353|gnl|PID|d1034182
exchanger isoform A3 [Homo
sapiens] 


7.8 


628 


AF027174 


Arabidopsis thaliana
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


0.17 


539355 


SCD25 protein (version 1) - 
yeast


7.5 


629 


AF059650


Homo sapiens histone
deacetylase 3
(HDAC3) gene,
complete cds


0.17 


482118


hypothetical protein C15H7.1 -
Caenorhabditis elegans


4.5 
630 | AF059650

Homo sapiens histone
deacetylase 3
(HDAC3) gene,
complete cds



631 | X55065



Chinese hamster
metallothionein II
gene



o.i: 



465932 



632 | U15280

Rattus norvegicus
oxytocin receptor
(OTR) gene, exon 3
and complete cds



633 | X04862

Goat embryonic alpha
globin gene zeta
exons 2-3



634 | AL010222

Plasmodium
falciparum DNA ***
SEQUENCING IN
PROGRESS ***
from contig 4-09,
complete sequence



0.17 



3687237 



0.17 



542565 



0.17 



86837 



635 | X60111

H.sapiens mRNA for
MRP-1



636 | U49979

Orf virus E10R
homolog gene, partial
cds, and DNA
polymerase gene,
complete cds



0.17 



1177322 



PROTEIN F58A4.11 IN
CHROMOSOME III

>gi|3874287|gnl|PID|e1344088
EST EMBL:C12577 comes
from this gene; cDNA EST
yk227e7.5 comes from this
gene; cDNA EST yk303d1.5
comes from this gene; cDNA
EST yk314c12.5 comes from
this gene; cDNA ...
EMBL:C11886 comes from this
gene; cDNA EST
EMBL:C12577 comes from this
gene; cDNA EST yk227e7.5
comes from this gene; cDNA
EST yk303d1.5 comes from this
gene; cDNA EST yk314c12.5
comes from this gene; cDNA ...



(AC005169) putative Cys3His
zinc-finger protein
cyclin E type II - fruit fly
(Drosophila melanogaster)
>gi|429168 (X75027)
Drosophila cyclin E type II
[Drosophila melanogaster]



4.4



1.5



androgen receptor B - human 



(X95466) CPG2 protein [Rattus 
norvegicus] 

>gi|l58S593|prf||2208498A 
plasticity-related gene [Rattus 
norvegicus]



0.17 



0.17 



3237306 



3850072 



(U92715) breast cancer
antiestrogen resistance 3 protein

(AL033385) dna-directed rna
polymerase iii subunit
[Schizosaccharomyces pombe]



0.45 



0.080 



7e-07 



3e-09 



7e-15 
SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


637 


AF006573 


Drosophila virilis 
maltase 1 (Mav1) and
maltase 2 (Mav2)
genes, complete cds 


0.17 


2500558 


PUTATIVE RIBONUCLEASE
III (RNASE III)
>gi|3876420|gnl|PID|e1346063
(Z81070) similar to ribonuclease
[Caenorhabditis elegans]


2e-29 


638 


AE001141


Borrelia burgdorferi
(section 27 of 70) of
the complete genome


0.16 


1850327 


(U52370) fertilin beta [Homo 
sapiens] 


2.3 


639 


M72980 


Anthonomus grandis 
vitellogenin gene 
(VTG), complete cds.


0 P 


3242750 


(AC005164) match to ESTs
AA731149 (NID:g2140138),
AA731908 (NID:g2752719),
AA287837 (NID:g1933519),
AA262811 (NID:g1898382),
and AA825820 (NID:g2899132)


2e-56


640 


AF023532 


Simulium vittatum 
ATPase 6 gene, 
mitochondrial gene 
encoding 
mitochondrial 
protein, partial cds 


0.11 


<NONE> 


<NONE> 


<NONE> 


641 


U76523 


Sambucus nigra lectin 
precursor mRNA. 
complete cds 


0.10 


3482965 


(AL031369) putative protein


0.49 


642 


AJ001596 


Danio rerio mRNA 
for opioid receptor 
homologue


0.099 


1706694 


LANOSTEROL SYNTHASE 
5.4.99.7) - fission yeast 
(Schizosaccharomyces pombe) 


2.3 


643 


U26341 


Oryctolagus 
cuniculus Na and Cl
dependent betaine 
transporter mRNA, 
complete cds. 


0.099 


2645804 


(AF033381) betaine
homocysteine methyltransferase
[Mus musculus]


0.59 


644 


M11633


Bacteriophage Cp-5 
(S. pneumoniae) 3' 
inverted terminal 
repeat. 


0.032 


2314695 


(AE000649) type IIS restriction 
enzyme R and M protein


4.3 


645 


X74103 


Streptomyces sp. 
gene for alkaline 
serine protease I 


0.073 


1314734 


(U54641) 220 kDa silk protein 
[Chironomus thummi]


6.3 
Nearest Neighbor (BlastN vs. Genbank)



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



SEQ 
ID



ACCESSION 



DESCRIPTION 



P VALUE 



ACCESSION 



DESCRIPTION 



P VALUE 



Caenorhabditis



646 | Z72509 



elegans cosmid 
F32G8, complete
sequence 
[Caenorhabditis 
elegans] 



0.072 



<NONE> 



<NONE> 



<NONE> 



647 | X70282



X.laevis xanf-1 gene



648 | Z69906



Human DNA 
sequence from 
cosmid E14IE2, on 
chromosome 22, 
complete sequence 
[Homo sapiens]



0.070 



3851202 



(AC005954) ZO-3 [Homo 
sapiens] [Homo sapiens] 



0.40 



0.069 



<NONE> 



<NONE> 



<NONE> 



649 | AF056940



Drosophila virilis
retrotransposon Tv1,
complete sequence 



0.069 



2246532 



(U93872) ORF 73, contains
large complex repeat CR 73



5e-12



650 | AJ001151



Homo sapiens
genomic sequence



0.068



651 | X54455



Bacteriophage BF23
gene 17 and gene 18



0.067 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



652 | X87936



pinea internal
transcribed spacers 1
& 2 of ribosomal
DNA

Dictyostelium



0.067 



2459733 



(U95374) aldehyde 
dehydrogenase [Haloferax 
volcanii] , 



4.3 



discoideum TipD
(tipD) gene, complete
cds

653 | AF019236



0.067 



3882275 



(AB018320) KIAA0777 protein
[Homo sapiens]



1.1



654 | X90592



O.cuniculus mRNA 
for p53 protein 



0.067 



1703275 



METHIONINE 
AMINOPEPTIDASE 2 
(METAP 2) GLYCOPROTEIN) 
(P67) 



0.29 



655 | U41805



Mus musculus
putative T1/ST2
receptor binding
protein precursor
mRNA, partial cds



0.067



656 | AB007881



Homo sapiens
KIAA0421 mRNA,
partial cds



642518 



(U17326) neuronal nitric oxide
synthase [Homo sapiens] 



0.066 



<NONE> 



<NONE> 



0.29 



<NONE> 



657 | AL010213

Plasmodium
falciparum DNA ***
SEQUENCING IN
PROGRESS ***
from contig 3-109, 
complete sequence 



0.066 



<NONE> 



<NONE> 



<NONE> 
















658 


AB014546


Homo sapiens mRNA
for KIAA0646








0.38 


659 


AF104156


Rattus exulans isolate
huahine30
mitochondrial D-
loop, partial sequence


0.066 


1002380 


(U24189) RRM-type RNA
binding protein [Caenorhabditis
elegans]


0.29 


660 


X97581 


M.musculus mRNA 
for spalt transcription 
factor 


0.066 


4107313 


(AL035075) putative myosin 
heavy chain 


0.28 


661 


D85378 


Human clone H20 N-
acetylglucosaminyltra
nsferase III DNA,
exon 2


0.066 


2114473 


(U96963) p140mDia [Mus
musculus]


0.22


662 


M97561 


Human (clone 

LA 179) chromosome 

21 sequence. 


0.065 


<NONE> 


<NONE> 


<NONE> 


663 


AE001373


Plasmodium 
falciparum 
chromosome 2. 
section 10 of 73 of 
the complete 
sequence 


0.065 


<NONE>


<NONE> 


<NONE> 


664 


S75479 


growth hormone 
receptor, growth 
hormone binding 
protein {GHR/BP 
gene } [mice, C57 
black/6, Genomic, 
179 nt, segment 8 of 
10] 


0.065


<NONE> 


<NONE> 


<NONE> 


665 


AF032922 


Homo sapiens 
syntaxin 4 binding 
protein UNC-18c 
(UNC- 18c) mRNA, 
complete cds 


0.065 


3061308 


(AB006074) topoisomerase III
[Mus musculus]


0.82 


666 


S80986


svp[40]=svp-related
nuclear
receptor/retinoid
signaling modulator
[zebra fishes, mRNA,
3876 nt]


0.065 


1326285


(U58734) weak similarity to
ankyrin G [Caenorhabditis
elegans]


0.12 



WO 01/02568 



PCT/US00/18374



667 | X59552

G.domesticus mRNA
for ventricular myosin
heavy chain

0.065

2497098

HYPOTHETICAL 14.2 KD
PROTEIN IN AMD1-RAD52
INTERGENIC REGION
>gi|1077180|pir||S49745
probable membrane protein
YML034w - yeast
(Saccharomyces cerevisiae)
>gi|575685 (Z46659) unknown
orf, len: 656, CAI: 0.13
[Saccharomyces cerevisiae]

0.014

668 | M72980

Anthonomus grandis
vitellogenin gene
(VTG), complete cds

0.065

3242750

(AC005164) match to ESTs
AA731149 (NID:g2140138),
AA731908 (NID:g2752719),
AA287837 (NID:g1933519),
AA262811 (NID:g1898382),
and AA825820 (NID:g2899132)

5e-33



669 | AB014546

Homo sapiens mRNA
for KIAA0646
protein, complete cds



0.064 



<NONE> 



<NONE> 



<NONE> 



670 | M30039 



Sheeppox virus strain
KS-1 ORF HM1
gene, partial cds;
ORF HM2 and ORF 
HM3 genes, complete 
cds; and ORF HM4 
gene, partial cds 



671 | Z68013 



Caenorhabditis
elegans cosmid
W02H3, complete
sequence
[Caenorhabditis
elegans]



0.064 



0.064



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



672 | AF041332

Bodo saltans
unknown mRNA,
kinetoplast gene
encoding kinetoplast
protein, complete cds



0.064 



<NONE> 



<NONE> 



<NONE> 



673 | J00451

Mouse germline IgG-

chain gene, D-J-C 
region, and switch 



0.064 



<NONE> 



<NONE> 



<NONE> 
674


U41289 


Dictyostelium
discoideum K7
kinesin-like protein
mRNA, complete cds


0.064 


3482972 


(AL031369) putative protein


9.3


675 
676 


M37395 
Z 15030 


L.lactis (strain SK11)
proteinase plasmid
pSK111 DNA,
complete cds.

H.sapiens gene for 
ventricular myosin 
light chain2>:: 
gb|L01652|HUMVM 
LC Human 
ventricular myosin 
light chain 2 gene, 
seven exons. 


0.064 
0.064 


993019 
730343 


(X87246) alternative start codon
[Pseudorabies virus]

PRECURSOR (PRL-R) mouse
>gi|220576|gnl|PID|d1001535
(D10214) prolactin receptor
precursor [Mus musculus]
>gi|293770 (L14811) prolactin
receptor [Mus musculus]
>gi|347842 (L13593) prolactin
receptor [Mus musculus]
receptor:ISOTYPE=long form
[Mus musculus]


9.2

9.1




677 
678 


Z12021
L05668


G.max gene for
catalase

Entamoeba histolytica
protein
serine/threonine
kinase (pstk1) gene,
complete cds.


0.064 
0.064 


2498711
733140


ORIGIN RECOGNITION
COMPLEX PROTEIN,
SUBUNIT 2 >gi|l ISS461
(U3S472) essential ORC2-
related fission replication factor
Orp2 [Schizosaccharomyces
pombe]

(U22453) carboxypeptidase
[Simulium vittatum]


5.3 

53 1 
















679 


U50715 


Mus musculus alpha- 
galactosidase A gene, 
complete cds 


0.064 


125398 


HYGROMYCIN-B KINASE
(HYGROMYCIN B
PHOSPHOTRANSFERASE)
(APH(7"))

>gi|66885|pir||WGSMHH
hygromycin B

phosphotransferase (EC 2.7.1.-) -
Streptomyces hygroscopicus
>gi|581682 (X03615) pot. hyg
protein [Streptomyces 
hygroscopicus] 
phosphotransferase [synthetic 
construct] >gi|2739064 cloning 
vector] >gi|2739068 
(AF025747) hygromycin B 
phosphotransferase [unidentified 
cloning vector] 


2.3 


680 


Z28182


S.cerevisiae
chromosome XI 
reading frame ORF 
YKL182w 


0.064 


1079035 


Om(2D) protein - fruit fly
(Drosophila ananassae)
>gi|443770|gnl|PID|d1006095
(D26553) ORF 


1.8 


681 


M29917


Human ornithine 
aminotransferase 
gene, exon 1. 


0.064 


2317934 


(U97553) unknown [murine 
herpesvirus 68] 


1.4 


682 


AB020709 


Homo sapiens mRNA 
for KIAA0902 
protein, complete cds 


0.064 


861404 


(U29154) T07F12.3 gene
product [Caenorhabditis 

elegans] 


0.47 


683 


AB014546


Homo sapiens mRNA 
for KIAA0646 
protein, complete cds 


0.064 


1708118 


HOMEOBOX PROTEIN HB9 
>gi|507425


0.35 


684 


AB010427


Homo sapiens mRNA 
for NORM, complete 
cds 


0.064 


2388676 


(AF015539) precollagen P
[Mytilus edulis]


0.018 


685 


U34774 


Orf virus ankyrin-like
repeat protein, F11L 
homolog, and F12L 
homolog genes, 
complete cds. 


0.064 


731668 


SSF1 PROTEIN
>gi|626624|pir||S46700 SSF1
protein - yeast (Saccharomyces
cerevisiae)


1e-05


686 


AF022861 


Mus musculus 
neuropilin-2(a5) 
mRNA, alternatively
spliced, complete cds 


0.064 


4091978 


(AF073359) benzaldehyde 
dehydrogenase [Pseudomonas 

sp. bni} 


1e-05





687 | U14331

Sus scrofa myogenin
gene, complete cds

0.064 


2781386 


(ALU04010) similar to Leucine-
rich transmembrane proteins;
44% similarity to U42767
(PID:g1736918) [Homo
sapiens]


3e-33


688 | AF074870

Chironomus
pallidivittatus clone
112 19 non-telomeric
Ssp repeat

689 | Z25523

H.sapiens repeat
region DNA.


0.063 
0.063 


<NONE> 
<NONE> 


<NONE> 

<NONE> 


<NONE>
<NONE>


690 | AE001378

Plasmodium
falciparum
chromosome 2,
section 15 of 73 of
the complete
sequence


0.063 


<NONE> 


<NONE> 


<NONE>


691 | Z72947

S.cerevisiae
chromosome VII
reading frame ORF
YGR162w


0.063 


<NONE> 


<NONE> 




692 | Y14723

Choanomphalus
incertus
mitochondrial
cytochrome c oxidase
subunit I gene, partial


0.063 


<NONE> 


<NONE> 


<NONE>
<NONE>


693 | X74103

694 | AF039843


Streptomyces sp.
gene for alkaline
serine protease I

Homo sapiens
Sprouty 2 (SPRY2)
mRNA, complete cds


0.063 
0.063 


1730713
232217


HYPOTHETICAL IU5XKD
PROTEIN IN UME3-PUB1
INTERGENIC REGION
>gi|2131S66|pir||S62935
hypothetical protein YNL023c -
yeast (Saccharomyces
cerevisiae)

>gi|1301855|gnl|PID|e239870
(Z71299) ORF YNL023c
[Saccharomyces cerevisiae]

TRANSFERASE GST-6.0
(GSTB1-1)

>gi|42119S|pir||S29772
glutathione transferase (EC
2.5.1.18) - Proteus mirabilis
>gi|2126142|pir||S71882
glutathione transferase (EC
2.5.1.18) B - Proteus mirabilis
>gi|1053076 (U38482)


6.7
5.2
695 | M63650

696 | Y13298

697 | X56600

698 | Z23107

699 | M20670

700



Mouse M-twist gene 
mRNA, complete cds



Z62997



Homo sapiens GDP
dissociation inhibitor
beta pseudogene

Rat SOD-2 gene for
manganese-
containing superoxide
dismutase



M.musculus mRNA
for 5HTx serotonin

receptor

Plasmodium vivax
circumsporozoite
protein gene, 3' end



701 | U95094

702 | U95098

703 | U95094

H.sapiens CpG DNA,
clone 76g11, reverse
read cpg76g11.rt1a

Xenopus laevis XL-
INCENP (XL-
INCENP) mRNA,
complete cds

Xenopus laevis
mitotic
phosphoprotein 44
mRNA, partial cds

Xenopus laevis XL-
INCENP (XL-
INCENP) mRNA,
complete cds






0.063 



0.063 



0.063 



0.063 



0.063 



0.063 



0.063 



0.063 



0.063 






1730141 



1085930 



3882143 



1708162 



4033395 



FRAGILE X MENTAL
RETARDATION SYNDROME
RELATED PROTEIN 2
>gi|2135129|pir||S60173 fragile
X mental retardation syndrome
related protein - human
>gi|1098637 (U31501) fragile X
mental retardation syndrome
related protein [Homo sapiens]

1.8

hypothetical protein 4 - fowl
adenovirus 1

1.3

(AB018254) KIAA0711 protein
[Homo sapiens]

q^q

HUNTINGTIN
(HUNTINGTON'S DISEASE
PROTEIN HOMOLOG) (HD
PROTEIN)

0.45



1350911 



2981200



3877951 



33930 IS 



DNA GYRASE SUBUNIT B

subunit [Myxococcus xanthus]
RETINOIC ACID RECEPTOR

RXR-BETA sapiens]
>gi|3l72493 (AF065396) 
retinoic X receptor B 
dJ1033B 10.11 (Retinoid X 
receptor beta (RXRB)) [Homo 
sapiens] >gi|4249766 
(AF120161) retinoic X receptor 
beta 



0.35 



0.16 



(AF048732) cyclin T2b [Homo
sapiens]



0.090 



(Z81555) predicted using
Genefinder



6e-07 



(AL031174) hypothetical 
protein 



2e-10 



704 | D90872

E.coli genomic DNA,
Kohara clone
#419(54.7-55.1 min.)

0.063


2498198 


CYTOCHROME B561 
(CYTOCHROME B-561) 


3e-19 


705 | M25528

706 | U45256


M.crystallinum
ferredoxin-NADP+
reductase (fnrA)
mRNA, complete cds

Strongyloides ratti
microsatellite B DNA


0.062 
0.062 


<NONE>
<NONE> 


<NONE> 

<NONE> 


<NONE> 
<NONE> 


707 | U95102

708 | AF044317

709 | Z73975


Xenopus laevis
mitotic

phosphoprotein 90
mRNA, complete cds

Homo sapiens
TEL/AML1 fusion
gene, partial sequence
Caenorhabditis
elegans cosmid
T06E8, complete
sequence
[Caenorhabditis
elegans]


0.062 
0.062 

0.062 


<NONE> 
<NONE> 

3108187 


(AC004663) Notch 3 [Homo 
sapiens]


<NONE> 
<NONE> 

2.9 


710 | X54232


Human mRNA for
heparan sulfate
proteoglycan


0.062 


1076741


chitinase (EC 3.2.1.14)
precursor - rice
>gi|S07955 (X87109) chitinase
[Oryza sativa]


0.59 


711 | X0307T


Bovine retinal mRNA
for transducin beta-
subunit


0.062 


477578


sialidase - Actinomyces viscosus
>gi|141S52


0.087 


712 | Y12573


D.melanogaster Jun
and 14-3-3 zeta gene


0.062 


3879551


(Z70756) similar to collagen


0.073 


713 | L26573


Bombus terrestris
mitochondrial
cytochrome oxidase I,
partial cds.


0.062 


1684959


(U20600) NADH
dehydrogenase subunit [Vanda
lamellata]


0.039 












AMINOPEPTIDASE B




714 


U58994 


Human ladinin (LAD) 
gene, complete cds 


0.062 


2811078 


(ARGINYL

AMINOPEPTIDASE)

(ARGININE 

AMINOPEPTIDASE) 

(CYTOSOL 

AMINOPEPTIDASE IV) (AP- 
B)>gi|2039143 (U61696) 
aminopeptidase B [Rattus 
norvegicus] 


9e-06 


715 


AB014553 


Homo sapiens mRNA 
for KIAA0653 
protein, partial cds


0.062


1326350 


(U58748) similar to potential 
transmembrane domains in S. 
cerevisiae nuclear division
RFT1 protein (SP:P38206)


5e-10


716 


L16898 


Mus musculus 
collagen alpha 1 type 
XVIII mRNA. Send. 


0.062 


1723657 


HYPOTHETICAL 58.5 KD
PROTEIN IN ERV1-GLS2
INTERGENIC REGION
>gi|2132587|pir||S64322
probable membrane protein
YGR031w - yeast
(Saccharomyces cerevisiae)
>gi|1323010|gnl|PID|e243277
(Z72816) ORF YGR031w
[Saccharomyces cerevisiae]


1e-14


717 


X99343 


M.tuberculosis 
guaA/B & choD 
genes 


0.062


3873807


(Z49907) B0491.1
[Caenorhabditis elegans]


2e-19


718 


AF010193 


Homo sapiens MAD- 
related gene SMAD7 
(SMAD7) mRNA. 
complete cds 


0.061 


<NONE> 


<NONE> 


<NONE> 


719 


L10182


Myrmeleon sp. 18S 
ribosomal RNA. 


0.061 


<NONE> 


<NONE> 


<NONE> 


720 


Y14723


Choanomphalus 
incertus 
mitochondrial 
cytochrome c oxidase 
subunit I gene, partial 


0.061 


<NONE> 


<NONE> 


<NONE> 


721 


L27840 


bovine respiratory 
syncytial virus 
nucleoprotein mRNA. 
complete cds.


0.061


542955


nucleoporin p62 - human


8.6 



722 | U95094

Xenopus laevis XL-
INCENP (XL-
INCENP) mRNA,
complete cds

723 | U95098

Xenopus laevis
mitotic
phosphoprotein 44
mRNA, partial cds

724 | U26463

Sporidiobolus
salmonicolor
NADPH-dependent
aldehyde reductase
gene, complete cds

0.061

0.061

0.061



494454 



725 | AF035443

Xenopus laevis wee1
homolog mRNA,
complete cds

726 | Z48584

Caenorhabditis
elegans cosmid
ZK1321, complete
sequence
[Caenorhabditis
elegans]



0.061 



0.061 



3183491



Sus scrofa

■ Ui|4y44DD|pdb|lPUS|H SB — 
scrofa Sus scrofa 
>gi|1421210|pdb|1PCP| Porcine
Spasmolytic Protein (Psp) (Nmr,
19 Structures) Spasmolytic

Polypeptide

>gi|1633061|pdb|2PSP|B Chain
B, Porcine Pancreatic
Spasmolytic Polypeptide



3845272

(AE001417) hypothetical
protein [Plasmodium
falciparum]

1710288

(U79302) unknown [Homo
sapiens]

EMBL:D33048 comes from this
gene; cDNA EST
EMBL:D35780 comes from this
gene; cDNA EST yk442c6.3
comes from this gene; cDNA
EST yk442c6.5 comes from this
gene; cDNA EST yk398f6.3
comes from this gene; cDNA
E...

>gi|3979S16|gnl|PID|e1358315
EST EMBL:D35780 comes
from this gene; cDNA EST
yk442c6.3 comes from this
gene; cDNA EST yk442c6.5
comes from this gene; cDNA
EST yk398f6.3 comes from this
gene; cDNA E...

3979720

hvpuiHhTKjALSisiai — 






29 



1.3 



0.44 



PROTEIN C27F2.7 IN 
CHROMOSOME III 
>gi|1065510 (U40419) C27F2.7
gene product [Caenorhabditis

elegans]



2e-04 



3e-11












HYPOTHETICAL 32.0 KD 




727 


X61489 


Zea mays pep gene 
for (C3 type) 
phosphoenol pyruvate 
carboxylase 


0.061 


2496887 


PROTEIN C09F5.2 IN 
CHROMOSOME III
>gi|732538 (U22832) C09F5.2
gene product [Caenorhabditis
elegans]


1e-15


728 


AF025408 


Drosophila 
melanogaster 
Windbeutel (wind) 
gene, complete cds 


0.061 


3702295 


(AC0057S3) R33083.1 [Homo 
sapiens] 


2e-60


729 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


0.060 


<NONE> 


<NONE> 


<NONE> 


730 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


0.060 








731 


Y08682 


H.sapiens mRNA for
carnitine 

palmitoyltransferase I 

type I 


0.060 


3319446 


(AF077541) contains similarity 
to class-I aminoacyl-tRNA 

elegans] 


8.1 


732 


U95094 


Xenopus laevis XL- 
INCENP (XL- 
INCENP) mRNA. 
complete cds 


0.060 


1041119 


(D78016) TRAE [Enterococcus 
faecalis] 


8.1 


733 


AF064030 


Helianthus tuberosus 
lectin 2 mRNA, 
complete cds 


0.060 


632209 


T-lymphotropic virus PTLV-L 
(fragment) 


3.7 


734 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


0.060 


3098348 


(AF037401) neuropeptide 
[Danio rerio]


2.1 


735 


U95102 


Xenopus laevis 
mitotic 

phosphoprotein 90 
mRNA, complete cds 


0.060 


125978


LAR PROTEIN PRECURSOR
(LEUKOCYTE ANTIGEN
RELATED)

>gi|70146|pir||TDHULK
leukocyte antigen-related
protein precursor - human
>gi|34267 sapiens]


1.2 


736 


U76523


Sambucus nigra lectin
precursor mRNA,
complete cds


0.060 


2055394


(U87306) transmembrane
receptor UNC5H2 [Rattus 
norvegicus] 


0.32 


737 


U69668


Human nuclear pore
complex-associated
protein TPR


0.060 


4127854


(Y14063) ChT1 thymocyte
antigen [Gallus gallus]


9e-04 















(U58748) similar to potential 




738 


AB014553 


Homo sapiens mRNA
for KIAA0653 
protein, partial cds 


0.060 


1326350 


transmembrane domains in S. 
cerevisiae nuclear division
RFT1 protein (SP:P38206)


1e-09


739 


U95098 


Xenopus laevis 
mitotic 

phosphoprotein 44 
mRNA, partial cds 


0.060 


2632098 


(Y15513) Prodos protein
[Drosophila melanogaster] 


5e-10 


740 


Z96260 


H.sapiens telomeric 
DNA sequence, clone 
L2QTEL 101, read 
12QTELOO10Lseq 


0.059 


<NONE> 


<NONE> 


<NONE> 


741 


M93128 


Mouse homeobox 
protein (EVX2) 
mRNA, complete cds. 


0.059 


<NONE> 


<NONE> 


<NONE> 


742 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


0.059 


1652318 


(D90904) lysostaphin 
[Synechocystis sp.] 


4.7 


743 


AB007920 


Homo sapiens mRNA 
for KIAA0451
protein, complete cds 


0.059 


479491 


transcription factor brn-3b - 
human 


0.71 


744 


M60445 


Human histidine 
decarboxylase (HDC) 
mRNA. complete cds 


0.058 


<NONE> 


<NONE> 


<NONE> 


745 


U01836 


Ustilago maydis 
exodeoxyribonucleas
e (REC1) gene,
complete cds. 


0.058 


1171908 


OLIGOPEPTIDE
TRANSPORT SYSTEM
PERMEASE PROTEIN OPPC
>gi|1075086|pir||D64184
oligopeptide transport system
permease protein (oppC)
homolog - Haemophilus
influenzae (strain Rd KW20)
permease protein (oppC)
[Haemophilus influenzae Rd]


1.5 


746 


AF090115


Lycopersicon
esculentum cytosolic 
class II small heat 
shock protein HCT2 
(HSP17.4) mRNA. 
complete cds 


0.053 


3193265 


(AF069131) chitinase [Bacillus 
subtilis] 


0.002 


747 


AB012105 


Brassica rapa mRNA 
for SLG45, complete 
cds 


0.057 


433385 


(U03978) dynein heavy chain 
isotype 7A [Tripneustes gratilla]


3.4 






Arabidopsis thaliana 










748 


AJ005813 


mRNA for
neoxanthin cleavage
enzyme


0.056 


<NONE> 


<NONE> 


<NONE> 


749 


Y16828


Lagopus lagopus 
genomic 
microsatellite 
sequence, LLST4 


0.056


3323678 


(AE001299) hypothetical 
protein [Chlamydia trachomatis] 


4.3 


750 


AF012899


Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds


0.055 


<NONE> 


<NONE> 


<NONE> 


751 


AF074385 


Sambucus nigra 
mRNA, complete cds 


0.055 


137339 


69 KD PROTEIN 
>gi|94375|pir||S19150
hypothetical protein, 69K - 
turnip yellow mosaic virus 


0.69 


752 


U76523 


Sambucus nigra lectin 
precursor mRNA, 
complete cds 


0.035 


<NONE> 


<NONE> 


<NONE> 


753 


M92069 


Human retrovirus-like
sequence-isoleucine c


0.034 


<NONE> 


<NONE> 


<NONE> 


754 


S78516 


G1L=ankyrin-like
repeat [orf virus OV,
NZ2, Genomic, 1608
nt]


0.033 


2804465 


(AF043700) contains similarity 
to human RNA-binding protein 
FUS/TLS (SW:Q2S009) 
[Caenorhabditis elegans]


0.15 


755 


M15646


Chicken myosin 
alkali light chain 
mRNA, complete cds, 
clone pFI. 


0.027


3334221


4-HYDROXYPHENYLPYRUVATE
DIOXYGENASE 4-
hydroxyphenylpyruvate
dioxygenase [Mycosphaerella
graminicola]


6e-17


756 


AF027174


Arabidopsis thaliana
cellulose synthase
catalytic subunit (Ath-
B) mRNA, complete
cds


0.025 


3877815


(Z96048) predicted using
Genefinder


5.0 
















757 


AJ002291 


Streptococcus 
pneumoniae pbplb 
gene, complete 


0.025 


3880487 


UOiSU I4j similar to ribose-
phosphate pyrophosphokinase; 
cDNA EST EMBL:D73173 
comes from this gene; cDNA 
EST EMBL:D70909 comes 
from this gene; cDNA EST 
EMBL:D73449 comes from this 
gene; cDNA EST 
EMBL:D76167 comes from this 
ge... 


1.7 


758 


X79104 


C.botulinum (NCTC
7272 type A) HA-33
and P-21 genes.


0.024 


2648615 


(AE000970) tungsten 
formylmethanofuran
dehydrogenase, subunit B (fwdB
2) [Archaeoglobus fulgidus]


6.1 


759 


U95102


Xenopus laevis 
mitotic 

phosphoprotein 90 
mRNA. complete cds 


0.024 


1663698 


(DS3785) expressed 
ubiquitously; product similar to 
D.melanogaster mam protein. 
[Homo sapiens]


4.7 


760 


U36197 


Chlamydomonas 
reinhardtii cobalamin- 
independent 
methionine synthase 
mRNA, complete cds 


0.024 


585723 


PEROXISOME
PROLIFERATOR
ACTIVATED RECEPTOR
GAMMA (PPAR-GAMMA)
>gi|2S38lS|pir||C42214
peroxisome proliferator-
activated receptor gamma chain -
African clawed frog >gi|214668
(M84163) peroxisome
proliferator activated receptor
gamma [Xenopus laevis]


0.42 


761 


L38865 


Macaca mulatta 
(clone MMVA63) T- 
cell receptor alpha 
(TCR A) mRNA. 
partial cds. 


0.023 


<NONE> 


<NONE> 


<NONE> 


762 


AF035948 


Mus musculus insulin
receptor substrate-3 


0.023 


2500587 


SPLICEOSOME 
ASSOCIATED PROTEIN 49 
spliceosome-associated protein 
SAP-49- human >gi|556217 


0.40 


763 


X98890 


S.tuberosum mRNA 
for inorganic 
phosphate 
transporter, StPTl 


0.023 


110072 


proline-rich protein MP4 -
mouse >gi|53182 


0.18 
764 | X91212

765 | AC004498

766 | U07083



L.esculentum mRNA 
for HP-ZIP protein 



Homo sapiens
chromosome 5, P1
clone 1209C1 (LBNL
H104), complete
sequence [Homo
sapiens]



767 | X98890



Human prostatic acid
phosphatase (ACPP)
gene, exon 1

S.tuberosum mRNA
for inorganic
phosphate
transporter, StPT1



768 | X56488 



esculentum LAT59 
gene 5'flanking 
region, expressed 
during pollen 
maturation 



0.022 



0.022 



0.022 



0.022 






<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



0.022 



<NONE> 



<NONE> 



<NONE>



<NONE>



<NONE> 



769 | M34651

770 | X66727



Pseudorabies virus
with upstream and
downstream
sequences.
P.taeda gene for
protochlorophyllide
reductase



771 | U95102

Xenopus laevis
mitotic
phosphoprotein 90
mRNA, complete cds

772 | U95098

Xenopus laevis
mitotic
phosphoprotein 44
mRNA, partial cds

0.022

0.022



<NONE> 



3878517 



<NONE>

<NONE>

(Z92806) K10G4.4
[Caenorhabditis elegans]

4.3

0.022

1854452

(D89501) similar to salivary
proline-rich protein P-B [Homo
sapiens]

43



0.022 



3021699 



773 | X71932

774 | X87369

H.sapiens XB gene
for tenascin-X, intron
14



C.perfringens nanH 
gene&ORF1.2.3&4 



(AB005298) BAI 2 [Homo 
sapiens] 



0.022 



627059

liver stage antigen LSA-1 -
Plasmodium falciparum
>gi|9916 (X56203) liver stage
antigen



0.64 



0.058 



0.022 



2062407

(U78975) poly(ADP-ribose)
glycohydrolase [Bos taurus]

0.056



775



Y14971

776 | AF003133

777 | AF003133

Gallus gallus mRNA
for K60 protein 



0.022 



Caenorhabditis 
elegans cosmid 
T21E3 



Caenorhabditis 
elegans cosmid 
T21E3 



779 | U67570 



780 | L01584



781 | L04787



782 | U95094 



783 



L36890 



Human helix-loop-
helix proteins Id-1
(ID-1) and Id-1' (ID-
1) genes, complete

cds

Methanococcus 
jannaschii section 112 
of 150 of the 
complete genome 



0.022 



0.022 






134091 



1709997 



1709997 



Trypanosoma cruzi 
calcium-binding 
protein (CUB2.8) 
gene, complete cds. 



Borrelia hermsii outer 
membrane lipoprotein 
Xenopus laevis XL-
INCENP(XL- 
INCENP) mRNA, 
complete cds 



0.021 



0.021 



0.021 



Saccharomyces 
cerevisiae
mitochondrion
transfer RNA-Thr1
(tRNA-Thr) gene;
transfer RNA-Val
(tRNA-Val) gene;
xi2 gene, complete
cds; ORF2 and origin
of replication (ori5).



0.021 



0.021 



<NONE> 



<NONE> 



<NONE> 



0.021 



<NONE> 



<NONE> 



<NONE> 



U1 SMALL NUCLEAR

KD (U1 SNRNP 70 KD)
>gi|85864|pir||S02016 U1
snRNP 70K protein - African
clawed frog >gi|65179
(X12430) U1 70K [Xenopus
laevis]



DNA REPAIR PROTEIN
RAD18 >gi|1150622 protein
rad18 [Schizosaccharomyces
pombe]



0.032



DNA REPAIR PROTEIN
RAD18 >gi|1150622 protein
rad18 [Schizosaccharomyces
pombe]



2e-08 



2e-08 



<NONE> 



<NONE> 



<NONE>



<NONE>



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE>



<NONE> 



<NONE> 



<NONE> 
















784 




Homo sapiens biliary 
glycoprotein (BGP) 
gene, partial cds. 




<NONE>


<NONE> 


<NONE> 


785 


M87504 


Tetrahymena 
thermophila histone
H3 (HHT2) gene,
complete cds


0.021 


<NONE> 


<NONE> 


<NONE> 


786 


U94346 


Human calpain-like
protease ^nira-^/ 
mRNA. complete cds 


0.021 


<NONE>


<NONE> 


<NONE> 


787 


L01584


Trypanosoma cruzi
calcium-binding
protein (CUB2.8)
gene, complete cds.


0.021 


<NONE> 


<NONE> 


<NONE> 


788 


U36530 


Pongo pygmaeus C 1 
microsatellite, clone 
#1, from the tandemly 
repeated genes 
encoding U2 small
nuclear RNA (RNU2 
locus) 


0.021 


<NONE> 


<NONE> 


<NONE> 


789




Human gene for
interleukin 1 alpha
(IL-1 alpha)


0.021


416974 


EARLY TRANSCRIPTION 
FACTOR 70 KD SUBUNIT


8.9 


790 


U20806 


Dictyostelium 
discoideum guanine 
nucleotide-binding 
protein alpha subunit 
5 (G alpha 5) gene, 
complete cds. 


0.021 


1401211 


(U58510)RNA helicase 
homolog [Chlorarachnion 
CCMP621] 


8.8 


791 


Z59258


H.sapiens CpG DNA,
clone 13d2, reverse
read cpg13d2.rt1c.


0.021 


3121732 


ACONITATE HYDRATASE 
(CITRATE HYDRO-LYASE) 

(AF002133) aconitase
[Mycobacterium avium]


7.0 


792 


AF030692


Plasmodium
falciparum strain 7G8
chloroquine
resistance candidate
protein (cg2) gene,
complete cds


0.021 


3024190


NINE PROTEIN
>gi|2120251|pir||S66581
hypothetical protein 56 - phage
32 >gi|1051114 (X92588)
orf56; related to nin60 (ninE) of
bacteriophage lambda


5.8 


793 


U67570


Methanococcus
jannaschii section 112
of 150 of the
complete genome


0.021 


2341037


(AC000104) F19P19.17
[Arabidopsis thaliana]


4.0 












NUCLEAR FACTOR NF- 




794 


D86566 


Human DNA for

NOTCH4, partial cds 


0.021 


1708619 


KAPPA-B P100 SUBUNIT
(H2TF1) (ONCOGENE LYT-
10) (LYT10) [CONTAINS:

NUCLEAR FACTOR NF-

KAPPA-B P52 SUBUNIT]


3.1 


795 


L11648


Streptomyces

factor (rpoX) gene, 
complete cds. 


0.021 


79833 


iijf pvuieiiLui i iy.jn. piuicin 

(uvrA region) - Micrococcus 
luteus 


1.8 


796 


U95094 


Xenopus laevis XL- 

INCENP) mRNA. 
complete cds 


0.021 


128000 


NEUROENDOCRINE

CONVERTASE 1

PRECURSOR (NEC 1) (PC1)

(PROHORMONE

CONVERTASE 1) propeptide

processing protease [Mus

cookii]


1.0 


797 


U30938 


Rattus norvegicus 

microtubule-

associated protein 2 


0.021 


468600 


(.A/^t-f ucia-j iniegnn 
[Takifugu rubripes]


1.0 


798 




Chicken mRNA for 
TSC-22 variant, 
complete cds, clone 


0.021




27 kda amelogenin 
(alternatively spliced)


V.Ol 


799 


U40041


Gallus gallus eHAND
mRNA, complete cds


0.021 


3449308 


(AB011541) MEGF8 [Homo
sapiens]


0.21 


800 


X71932 


H.sapiens XB gene

for tenascin-X, intron
14 


0.021 


627059 


liver stage antigen LSA-1 -

Plasmodium falciparum

>gi|9916 (X56203) liver stage
antigen


0.054 


801 


AF042333 


Oryza sativa 24- 
methylene lophenol 
C24(l)methyltransfer 
ase mRNA, complete 
cds 


0.021 


854065 


(X83413) U88 [Human 
herpesvirus 6] 


0.014 


802 


L37380 


Rat apical endosomal 
glycoprotein mRNA. 
complete cds. 


0.021 


3334377 


TRANSMEMBRANE 
PROTEASE, SERINE 2 


le-05 


803 


AF003133 


Caenorhabditis 
elegans cosmid 

T21E3


0.021 


1709997 


DNA REPAIR PROTEIN
RAD18 >gi|1150622 protein
rad18 [Schizosaccharomyces
pombe]


3e-08






804 | X57689



805 | U95102



806 | X77753



Rabbit mRNA for



calcium channel BI-2 
(lambda CBP 109 and 
CB101) 



Xenopus laevis 
mitotic 

phosphoprotein 90 
mRNA. complete cds 



0.021 



0.021 



H.sapiens TROP-2 
gene 



0.021 






2959370 



1109830 



1723657 



(AL022117) hypothetical
protein 



(Ui534) coded for by IT 
elegans cDNA CEESI42F; 
Similar to helicases of 
SNF2/RAD54 family. 
[Caenorhabditis elegans] 

UVUIIIULIU'U Hu l' i j 



HYPOTHETICAL 58.5 KD
PROTEIN IN ERV1-GLS2
INTERGENIC REGION
>gipl32587|pir||S64322
probable membrane protein
YGR031w - yeast
(Saccharomyces cerevisiae)
>gi|1323010|gnl|PID|e243277
(Z72816) ORF YGR031w
[Saccharomyces cerevisiae]



P VALUE 



1e-10



5e-U 



807 | X98890



S. tuberosum mRNA
for inorganic
phosphate
transporter, StPT1



0.021 



2137872 



zinc finger protein PZF - mouse 
>gi|453376 



808 | AF027173



Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 



809 | AJ224935



810 | U76524



811 | X99941



Homo sapiens 
Promotor Region and
PCK2 gene



Sambucus nigra 
ribosome inactivating
protein precursor 
mRNA, complete cds 



0.020 



0.020 



0.020 



A.thaliana GBF1 
gene 



0.020 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



812 | X65138



M.musculus mRNA
for tyrosine kinase
>gb|S57168|S57168
Sek=Eph-related
receptor protein
tyrosine kinase [mice,
mRNA, 4242 nt]



0.020 



<NONE> 



<NONE> 



<NONE> 
SEQ
ID 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



DESCRIPTION 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



P VALUE | ACCESSION 



DESCRIPTION 



P VALUE 



813 | L04787 



Borrelia hermsii outer
membrane lipoprotein 



0.020 



<NONE> 



<NONE> 



814 | AJ223633



Enterococcus faecium
genes encoding
enterocin L50A and
enterocin L50B plus

and 3' flanking
regions



815 | AB012106



Brassica rapa mRNA 
for SRK45, complete 
cds 



0.020 



<NONE> 



<NONE> 



<NONE>



0.020 



<NONE>



<NONE> 



<NONE> 



816 | AE001539 



Helicobacter pylori, 
strain J99 section 100 
of 132 of the 
complete genome 



0.020 



172292 



817 | AF074386 



Sambucus nigra 
hevein-like protein 
mRNA, complete cds 



0.020 



94173 



(L11895) transmembrane
protein [Saccharomyces 



cerevisiae 



8.4 



pol polyprotein - Chinese 
hamster intracisternal A-particle
CHIAP34 



8.0 



818 | M55264



Herpesvirus saimiri
dihydrofolate
reductase (DHFR)
and snRNA (HSUR)
genes, complete cds.



0.020 



2924250 



(Z98745) dJ29K1.2 [Homo 
sapiens]



6.5 



819 | AF052163 



820 | AF074387 



Homo sapiens clone 
24456 mRNA 
sequence 



0.020 



1706288 



D(4) DOPAMINE RECEPTOR
(D(2C) DOPAMINE
RECEPTOR)

>gi|21194S2|pir||I49246 D4
dopamine receptor - mouse
>gi|758427 (U19880) D4
dopamine receptor [Mus
musculus]

>gi|1095539|prf||2109259A
dopamine D4 receptor [Mus
musculus]



Sambucus nigra 
hevein-like protein 
mRNA. complete cds 



0.020 



2113798 



(Z83259) AmphiBrf38
[Branchiostoma floridae]



4.9 



4.7 



821 | AF052163



Homo sapiens clone 
24456 mRNA 
sequence 



0.020 



lZtWb4)el>NA EST 
EMBL:T02354 comes from this 
gene; cDNA EST 
EMBL:D32698 comes from this 
gene; cDNA EST 
EMBL:D35411 comes from this



3874733 



4.7 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















822 


L11002


Rat ankyrin binding
glycoprotein-1 related
mRNA sequence.


0.020 


552132 


(K01664) Bkm-like protein
[Drosophila melanogaster] 


3.8 


823 


AE001539 


Helicobacter pylori, 
strain J99 section 100 
of 132 of the 
complete genome 


0.020 


172292 


(L11895) transmembrane
protein [Saccharomyces 
cerevisiae] 


3.8 


824 


X98890 


S. tuberosum mRNA 
for inorganic 
phosphate 
transporter, StPT1


0.020 


3879798 


Domain (2 domains); cDNA
EST yk390b10.3 comes from
this gene; cDNA EST
EMBL:D71652 comes from this
gene; cDNA EST yk275f8.3
comes from this gene; cDNA
EST yk393b9.3 comes from this
gene; cDNA EST yk37...
>gi|3880220|gnl|PID|e1349842
yk390b10.3 comes from this
gene; cDNA EST
EMBL:D71652 comes from this
gene; cDNA EST yk275f8.3
comes from this gene; cDNA
EST yk393b9.3 comes from this
gene; cDNA EST yk37...


1.3 


825 


U97519 


Homo sapiens 
podocalyxin-like 
protein mRNA, 
complete cds 


0.020 


1345633 


C-1-TETRAHYDROFOLATE
SYNTHASE, CYTOPLASMIC
(C1-THF SYNTHASE)
(METHYLENETETRAHYDROFOLATE
DEHYDROGENASE/
METHENYLTETRAHYDROFOLATE
CYCLOHYDROLASE
C1-tetrahydrofolate synthase
[Rattus norvegicus]


0.066 


826 


AF003133 


Caenorhabditis 
elegans cosmid 
T21E3 


0.020 


1709997 


DNA REPAIR PROTEIN 
RAD18 >gi|1150622 protein
rad18 [Schizosaccharomyces
pombe]


2e-07 


827 


U32857 


Saccharomyces 
cerevisiae VAR1
gene, mitochondrial 
gene encoding 
mitochondrial 
protein, 3' processing 
site, partial sequence 


0.019 


<NONE> 


<NONE> 


<NONE> 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












NEUROGENIC LOCUS




828


AF027174 


Arabidopsis thaliana 
cellulose synthase
catalytic subunit (Ath-
B) mRNA, complete
cds 


0.019 


2506381 


NOTCH HOMOLOG

PROTEIN 4 PRECURSOR
(TRANSFORMING PROTEIN
INT-3) mammary gene mRNA,
complete cds.], gene product 
[Mus musculus] 


3.3 


829 


AF034099 


Laccaria bicolor 
glyoxal malate 

synthase protein
mRNA, complete cds


0.019 


3880930 


nVLLT2 1 4X n si m 1 1 artn 

Phosphoglucomutase and 
phosphomannomutase 
phosphoserine; cDNA EST 
ciVLOL,.i^joioo comes rrom mis 
gene; cDNA EST 
EMBL:D70697 comes from this 
gene; cDNA EST yk373h9.5 
comes from this gene; cDNA 
EST EMBL:T008... 


6e-15 


830 


AF100694 


Mus musculus 
Pontin52 mRNA. 
complete cds 


0.018 


<NONE> 


<NONE> 


<NONE> 


831 


U24578 


Human RP1 and 
complement C4B 
precursor (C4B) 
genes, partial cds.


0.013 


478673 


proline-rich protein precursor - 
kidney bean vulgaris]


3.1 


832 


U76523 


Sambucus nigra lectin 
complete cds 


0.011 


<NONE> 


<NONE> 


<NONE> 


833 


U57649 


Dibenzofuran-
degrading bacterium
DPO360 2,3-
dihydroxybiphenyl
1,2-dioxygenase
(bphC) gene,
complete cds and 2-
hydroxy-6-oxo-6-
phenylhexa-2,4-
dienoic acid
hydrolase


0.011


<NONE> 


<NONE> 


<NONE> 


834


X15642


Z.mays gene for
phosphoenolpyruvate
carboxylase


0.011 


<NONE> 


<NONE> 


<NONE> 


835


X51623


C.elegans collagen
gene col-13


0.010


1695686


(D83706) pyruvate carboxylase
[Bacillus stearothermophilus]


3.1


836


U83656


Rattus norvegicus NT-
CB gene, promotor
region


0.008


4240195


(AB020660) KIAA0853 protein
[Homo sapiens]


10.0 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












POL POLYPROTEIN 




837 


AJ222657 


Homo sapiens gene 
encoding retina- 
specific guanylyl 
cyclase 


0.008 


417704 


(ORF1A/1B) [CONTAINS:
RNA-DIRECTED RNA
POLYMERASE; HELICASE;
PROTEASE]


7.4 


838 


AB012106


Brassica rapa mRNA
for SRK45, complete
cds 


0.008 


544024


CHLORIDE CHANNEL

PROTEIN, SKELETAL
MUSCLE (CHLORIDE
CHANNEL PROTEIN 1) (CLC-
1) human >gi|397143 (Z25587)
human ClC-1 muscle chloride
channel [Homo sapiens]
>gi|398161 (Z25884) human
ClC-1 muscle chloride channel
[Homo sapiens]


4.6 


839 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA. complete cds 


0.008 


532468 


(U13643) similar to reverse
transcriptase; possible 
pseudogene [Caenorhabditis 
elegans] 


3.8 


840 


AF012899


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


0.008 


4101160 


(AF002589) cytochrome 
oxidase I [Austrofundulus
limnaeus] 


2.7 


841 


AF074385 


Sambucus nigra 
hevein-like protein 
mRNA. complete cds 


0.008 


1711520 


SRB-S/9 PROTEIN 
>gi|1334996


1.6 


842 


U48734 


Human non-muscle
uipitkL-uL Limn rru\i i r\, 

complete cds 


0.008


2829922 


(AC002291) extensin
[Arabidopsis thaliana]


0.11 


843


U66669


Homo sapiens 3-
hydroxyisobutyryl-
coenzyme A
hydrolase mRNA,
complete cds


0.007 


<NONE> 


<NONE> 


<NONE> 


844


D16492


Mouse mRNA for
P100 serine protease
of Ra-reactive factor
(RaRF), complete cds


0.007 


<NONE> 


<NONE> 


<NONE> 
Nearest Neighbor (BlastN vs. Genbank) | Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
SEQ ID | ACCESSION | DESCRIPTION | P VALUE | ACCESSION | DESCRIPTION | P VALUE

845 | D90923 | Human immunodeficiency virus type 1 proviral DNA for envelope glycoprotein, partial cds, isolate 03S | 0.007 | <NONE> | <NONE> | <NONE>


846 | AB011087 | Homo sapiens mRNA for KIAA0515 protein, partial cds | 0.007 | <NONE> | <NONE> | <NONE>


847 | AE000688 | Aquifex aeolicus section 20 of 109 of the complete genome | 0.007 | <NONE> | <NONE> | <NONE>
848 | X63723 | B.bovis WC1.1 mRNA | 0.007 | <NONE> | <NONE> | <NONE>


849 | AF074386 | Sambucus nigra hevein-like protein mRNA, complete cds | 0.007 | <NONE> | <NONE> | <NONE>
850 | J00097 | Human beta globin region Alu repetitive sequence type T. | 0.007 | <NONE> | <NONE> | <NONE>


851 | D90923 | Human immunodeficiency virus type 1 proviral DNA for envelope glycoprotein, partial cds, isolate 03S | 0.007 | <NONE> | <NONE> | <NONE>


852 | U95094 | Xenopus laevis XL-INCENP (XL-INCENP) mRNA, complete cds | 0.007 | <NONE> | <NONE> | <NONE>


853 | X91618 | T.castaneum hunchback gene | 0.007 | <NONE> | <NONE>




854 | X03838 | Rat nontranscribed spacer (NTS) downstream of 28S rRNA gene | 0.007 | <NONE> | <NONE> | <NONE>




855 | M55049 | Rattus norvegicus interleukin-2 receptor alpha chain (CD25) mRNA, complete cds. | 0.007 | <NONE> | <NONE> | <NONE>
Nearest Neighbor (BlastN vs. Genbank)

SEQ

ID | ACCESSION



857 | AF027173



858 | AF027174 



859 | AF012899
X95276 



860 



DESCRIPTION 



P VALUE 



H.sapiens CpG DNA,
clone 9e2, reverse
read cpg9e2.rtla.

Arabidopsis thaliana
cellulose synthase
catalytic subunit (Ath-

A) mRNA, complete 
cds 

Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 

B) mRNA, complete 
cds 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION | DESCRIPTION



0.007 



<NONE> 



0.007



<NONE> 



0.007 



<NONE> 



Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds | 0.007

falciparum
complete gene map of
plastid-like DNA | 0.007



861 | U72396 



862 | AF100694



Lycopersicon 
esculentum class II 
small heat shock 
protein Le-HSP17.6
mRNA, complete cds



<NONE> 



<NONE> 



863 | AB000383



864 | D86566



865 | U76524



Mus musculus 
Pontin52 mRNA. 
complete cds 

Leucania seperata 
nuclear polyhedrosis 
virus DNA forpi3. 
xe, envelope protein. 
complete cds 



0.007 



<NONE> 



0.007 



<NONE> 



0.007 



<NONE> 



Human DNA for 
NOTCH4, partial cds



Sambucus nigra
ribosome inactivating 
protein precursor 
mRNA. complete cds 



0.007 



<NONE> 



0.007 



<NONE> 






<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 


Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION


P VALUE




DESCRIPTION


P VALUE














866 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath 
A) mRNA, complete 
cds 


0.007 


3047072 


(AF058825) No definition line 
found [Arabidopsis thaliana] 


8.9 


867 


AF027174 


Arabidopsis thaliana 
cellulose synthase 

catalytic subunit (Ath-

B) mRNA, complete 
cds 


0.007


975754


(U29359) SpaO [Salmonella
enterica]


8.6 


868 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


0.007 


1213557 


(U50199) coded for by C.
elegans cDNA yk89e9.5; coded
for by C. elegans cDNA cm7g5;
coded for by C. elegans cDNA
cm14b9; coded for by C.
elegans cDNA yk52g5.5; coded
for by C. elegans cDNA
yk76e5.5; coded for by C.
elegans cDNA yk131f11.5; c...


8.4 


869 


AB012106


Brassica rapa mRNA

for SRK45, complete 
cds 


0.007 


2499568 


ISOASPARTATE(D-
ASPARTATE) O-
METHYLTRANSFERASE
(PROTEIN-BETA-
ASPARTATE
METHYLTRANSFERASE)

(PIMT) (PROTEIN L-
ISOASPARTYL/D-

ASPARTYL

METHYLTRANSFERASE)
methyltransferase [Drosophila
melanogaster] >gi|1171337
melanogaster]


8.3 


870 


AF093268 


Rattus norvegicus 
homer-1c mRNA,
complete cds 


0.007 


4092077 


(AF095353) toll-like receptor 4
mutant [Mus musculus] 


6.2 


871


AF074386


Sambucus nigra
hevein-like protein
mRNA, complete cds 


0.007 


151377 


(M80653) tetraheme
[Pseudomonas stutzeri]


6.2 


872


L42319


Bos taurus (clone 
Sal3.S) tristetraprolin 


0.007 


2507337


TRANSCRIPTION
TERMINATION FACTOR
RHO


5.5 
Nearest Neighbor (BlastN vs. Genbank)



SEQ 

ID | ACCESSION



DESCRIPTION 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



DESCRIPTION 



P VALUE



873 | M59815



Human complement
component C4A
gene, exons 10
through 41.



0.007 



3876769 



874 | X63723



B.bovis WC1.1 
mRNA 



(^oyoJ7) Similarity to Human 
Prolyl 4-hydroxylase alpha 
subunit (SW:P4HA_HUMAN); 
cDNA EST yk219g12.5 comes
from this gene; cDNA EST
yk319d8.5 comes from this
gene; cDNA EST yk339d11.5
comes from this gene; cDNA
EST yk371c9.3...



0.007 



2969893 



(AJ001858) human SIM2
[Homo sapiens] 



5.3 



5.3 



875 | AB009864 



Expression vector 
pME18S-FL3, 
complete sequence 



0.007 



p45 NF-E2 related factor 2 - 
2137618 | mouse musculus] 



876 | U76524 



Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA. complete cds 



0.007 



877 | U95102



Xenopus laevis
mitotic 

phosphoprotein 90 
mRNA. complete cds 



0.007



2S04497 



(AF043705) contains similarity 
to C2H2-type zinc fingers 



440298 



(L27469) product of alternative 
splicing [Drosophila 
melanogaster]



5.1 



5.0 



4.7 



878 | X58869



Chicken mRNA for 
aldehyde 
dehydrogenase 



0.007 



11S5062 



(L75945) flagellar export 
protein [Borrelia burgdorferi]



4.1 



879 | AF027735



Nephila clavipes
minor ampullate silk
protein MiSp1
mRNA, partial cds



0.007 



2394390 



(AF017434) pmi-like gene
product [Methylobacterium
extorquens] 



4.0 



880 | AF105228



Bos taurus tuftelin
mRNA, complete cds



0.007 



3036S02 



(AL022373) putative protein



, putative pn 

H ¥ KJ 1 Hb i 1LAL bv.Z KJJ 

PROTEIN T27F2.1 IN 

CHROMOSOME V 

>gi|3880311|gnl|PID|e1349855

BX42 (SW:BX42_DROME); 

cDNA EST EMBL:C07233 

comes from this gene; cDNA 

EST EMBL:C08532 comes 

from this gene; cDNA EST 

yk501h10.3 comes from this

gene; cDNA EST yk501f1.3...



881 | AF100694



Mus musculus
Pontin52 mRNA,
complete cds



0.007 



2500SI4 



3.9 



3.8 





Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












(U78289) tylactone synthase 




882 


X93567 


L. major mRNA for 
beta-tubulin f 1404bp 


0.007 


2317862 


modules 4 & 5 [Streptomyces 


3.0 


883 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


0.007


3881103 


^Ai-uj^o4o; predicted using 
Genefinder; cDNA EST
EMBL:D76407 comes from this 
gene; cDNA EST 
EMBL:C08999 comes from this 
gene; cDNA EST yk199b12.5
comes from this gene; cDNA 
EST yk282a4 J comes from this 
gene; cDNA EST EMBL:C0... 


2.7 


884


AF041056


Homo sapiens
WSCR4 gene, exons
3 and 4 


0.007 


135817 


THROMBIN RECEPTOR 
PRECURSOR human 
>gi|339677 (M62424) thrombin 
receptor [Homo sapiens] 


2.2 


885 


AF093268


Rattus norvegicus
homer-1c mRNA,
complete cds 


0.007 


1723518 


HYPOTHETICAL 32.2 KD
PROTEIN C22E12.04 IN
CHROMOSOME I >gi|1220279 
(Z70043) unknown 


2.1 


886 


M74798 


Hevea brasiliensis 3-
hydroxy-3- 
methylglutaryl- 
coenzyme A 
reductase gene, 3' 
end. 


0.007 


I0012S2 


(D64003) polyA polymerase 


1.9 


887 


Z62997 


H.sapiens CpG DNA, 
read cps76gll.nl a . 


0.007 


1176532 


HYPOTHETICAL 111.9 KD 
PROTEIN C34E10.8 IN
CHROMOSOME III
>gi|500731 (U10402) weakly 
similar to protein C kinase 
substrate [Caenorhabditis 


1.8 


888


AF074386 


Sambucus nigra 
hevein-like protein 
mRNA. complete cds 


0.007 


2498317 


DVA-1 POLYPROTEIN
PRECURSOR nematode
polyprotein antigen precursor
[Dictyocaulus viviparus]
>gi|1585421|prf||2124414A
polyprotein antigen/allergen
[Dictyocaulus viviparus]


1.2 


889


L29426


Synechocystis species
(strain PCC 6803)
irgA gene, complete
cds.


0.007


3882275


(AB018320) KIAA0777 protein
[Homo sapiens]


1.1 








Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















890 


D83329 


Mus musculus DNA 
for prostaglandin D2 
synthase, complete 
cds 


0.007 


1001741 


(D64004) hypothetical protein 


0.97 


891 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


0.007 


1723928 


HYPOTHETICAL 11.6 KD 
PROTEIN IN NUT1-ARO2
INTERGENIC REGION
PRECURSOR YGL149w -
yeast (Saccharomyces


0.94 


892 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA. complete cds 


0.007 


121452 


GLUTENIN, HIGH
MOLECULAR WEIGHT 
SUBUNIT 12 PRECURSOR 
>gi|82606|pir||A24266 glutenin 
high molecular weight chain 12 
precursor - wheat >gi|21779 


0.79 


893 


AF027173 


Arabidopsis thaliana
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


0.007 


927287 


(U30294) ORF2 [Prevotella 
ruminicola] 


0.35 


894 


Y11918 


H.sapiens IMAGE 
cDNA clone 268S1 


0.007 


1055188 


(U40061) contains similarity to 
transmembrane domains like 
those found in sugar transporter 
proteins 


0.26 


895 


L36827 


Mus Musculus 
alphaA-crystallin- 
binding protein I 


0.007 


4063019 


(AF0S3061) ABC transporter
TliF [Pseudomonas fluorescens]


0.21 


896 


L36827 


Mus Musculus 
alphaA-crystallin-
binding protein I 


0.007 


4063019 


(AF033061) ABC transporter 
TliF [Pseudomonas fluorescens] 


0.20 


897 


Z65719


H.sapiens CpG DNA,
clone 54c10, reverse
read cpg54c10.rtla.


0.007 


1097307 


HIC-1 gene [Homo sapiens]


0.20 


898 


AF064029 


Helianthus tuberosus
lectin 1 mRNA, 
complete cds 


0.007 


1174915 


UTROPHIN (DYSTROPHIN- 
RELATED PROTEIN 1) 
(DRP1) (DRP)

>gi|284488|pir||S28381 utrophin
protein) [Homo sapiens] 


0.002 


899


AF051730


Mus musculus
cathepsin S (CatS)
gene, exon 6


0.007 


1707017 


(U7S721) RNA helicase isolog
[Arabidopsis thaliana]


0.001 






Nearest Neighbor (BlastN vs. Genbank) 



SEQ 

ID | ACCESSION | DESCRIPTION



Oryctolagus



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



P VALUE | ACCESSION | DESCRIPTION



P VALUE 



cuniculus 
gp42/basigin/OX- 
47/HT7 mRNA, 



900 U62398 complete cds. 



0.007 



901 | X76341 



M.musculus
glutathione reductase 
mRNA. 



2370494 



(Z98944) hypothetical protein | 2e-04



0.007 



3513303 



(AC005594) R26984J [Homo 
sapiens] 



902 



Rat (lambda 20B0.5)
M-type 6-
phosphofructo-2-
kinase/fructose-2,6-
M26215 | bisphosphatase



0.007 



3036809 



(AL022373) putative 



putat 

(AB007902) HH<W12'cDNA 
clone for KIAA0442 has a 574- 
bp insertion at position 1474 of 
the sequence of KIAA0442. 
[Homo sapiens] | 2e-17 



903 



Homo sapiens 
KIAA0442 mRNA, 
AB007902 partial cds 



904 I U93364 



Lactococcus lactis 
cremoris plasmid 
pNZ4000 insertion 
sequence IS982
putative transposase
gene and eps gene
cluster

(epsRXABCDEFGHI
JKL), complete cds



905 | AF093268 



Rattus norvegicus
homer-1c mRNA,
complete cds 



0.007 



2662165 



0.007 



0.006 



2731377 



<NONE> 



(U28739) similar to alcohol
dehydrogenase/ribitol
dehydrogenase [Caenorhabditis
elegans] | 1e-31



<NONE> 



906 



Mus musculus
Pontin52 mRNA,
AF100694 | complete cds



907 



Sambucus nigra
hevein-like protein
AF074386 | mRNA, complete cds



0.006 



<NONE> 



<NONE> 



<NONE> 



0.006 



<NONE> 



<NONE> 



908 | AF027174



Arabidopsis thaliana
cellulose synthase
catalytic subunit (Ath-
B) mRNA, complete
cds



0.006 



909 



Arabidopsis thaliana
mRNA for
neoxanthin cleavage
AJ005813 | enzyme



<NONE> 



0.006 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE>








Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















910 


AF027174 


Arabidopsis thaliana
cellulose synthase
catalytic subunit (Ath-
B) mRNA, complete 
cds 


0.006 


<NONE> 


<NONE> 


<NONE> 


911 


AF093268 


Rattus norvegicus
homer-1c mRNA,
complete cds 


0.006 


<NONE> 


<NONE> 


<NONE> 


912 


AF093268 


Rattus norvegicus 
homer-1c mRNA,
complete cds 


0.006 


<NONE>


<NONE> 


<NONE> 


913 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


0.006 


<NONE> 


<NONE> 


<NONE> 


914 


AF064029 


Helianthus tuberosus 
lectin 1 mRNA, 
complete cds 


0.006 


<NONE> 


<NONE> 


<NONE> 


915 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


0.006 


<NONE> 


<NONE> 


<NONE> 


916 


AF093268 


Rattus norvegicus 
homer-1c mRNA,
complete cds 


0.006 


4049856 


(AF063866) ORF MSV064 
hypothetical protein 
[Melanoplus sanguinipes 
entomopoxvirusl 


9.6 


917 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


0.006 


3880536 


U-SJU/U) predicted using 
Genefinder; similar to Lectin C-
type domain short and long
forms (2 domains); cDNA EST
EMBL:C10633 comes from this
gene; cDNA EST
EMBL:C12424 comes from this
gene; cDNA EST yk191e7.3
comes from this ... 


7.9 


918 


AF012899


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


0.006 


3877761 


(ZS1552) F56G4.1 
[Caenorhabditis elegans]
>gi|3S786l5|gnl|PID|e1348240
(ZS3U8) F56G4.1 


7.5 


919 


X80289


H.sapiens PTPL1
mRNA for protein 
tyrosine phosphatase 


0.006 


116S791 


CATHEPSIN E PRECURSOR 
precursor - rabbit >gi|402729 
(L0S41S) procathepsin E 


7.4 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












DIACYLGLYCEROL




920 


AF074386


Sambucus nigra
hevein-like protein
mRNA, complete cds


0.006 


1346371 


KINASE, BETA

DIACYLGLYCEROL 

KINASE) 

>gi|477059|pir||A47744
diacylglycerol kinase (EC

2.7.1.107) beta - rat 90kDa-
diacylglycerol kinase [Rattus


5.5 


921 


U72396 


Lycopersicon 
esculentum class II
small heat shock
protein Le-HSP17.6
mRNA, complete cds


0.006


2196567 


(D88588) lipoprotein
[Escherichia coli] 


4.3 


922 


AF074387 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


0.006 


2113798 


(Z83259) AmphiBrf38
[Branchiostoma floridae]


4.3 


923 


AB012106


Brassica rapa mRNA
for SRK45, complete
cds 


0.006 


1388166 


\\JJv—v—i OUWC1 J_ J — ' I UJUUIllla 

melanogaster] 


4.3 


924 


AF074386 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


0.006 


2496785 


HYPOTHETICAL 20.1 KD 
PROTEIN Y4YS 


4.2 


925 


AF012899 


Sambucus nigra
ribosome inactivating

protein precursor

mRNA, complete cds


0.006 


416592 


A-AGGLUTININ
ATTACHMENT SUBUNIT
PRECURSOR
>gi|l0Li70|pir||A41258 a-
agglutinin core protein AGA1 -

yeast (Saccharomyces

cerevisiae)


2.7


926 


AF064029 


Helianthus tuberosus
lectin 1 mRNA, 
complete cds 


0.006 


416592 


A-AGGLUTININ
ATTACHMENT SUBUNIT
PRECURSOR
>gi|L0H70|pir||A41258 a-
agglutinin core protein AGA1 -
yeast (Saccharomyces
cerevisiae) 


2.5 


927 


AJ005813 


Arabidopsis thaliana
mRNA for
neoxanthin cleavage
enzyme


0.006 


3258584 


(U41263) The 3' UTR of this
gene overlaps the 3' UTR of
T19D12.6 (confirmed by EST
hits) [Caenorhabditis elegans]


2.0 


928 


U33949 


Human Down 
Syndrome region of 
chromosome 21,
genomic sequence, 
clone A12HI-1A6. 


0.006 


3S50997 


(AF067150) beta-hydroxyacyl-
ACP dehydratase precursor


1.9 



Nearest Neighbor (BlastN vs. Genbank) | Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
SEQ ID | ACCESSION | DESCRIPTION | P VALUE | ACCESSION | DESCRIPTION | P VALUE


1175 | AF027173


Arabidopsis thaliana
cellulose synthase 
catalytic subunit (Ath 
A) mRNA, complete 
cds 


2e-04 


<NONE> 


<NONE> 


<NONE> 


1176 | Y09232


H. sapiens ferrilin
alpha pseudogene


2e-04 


<NONE> 


<NONE> 


<NONE> 


1177 | AJ005813


Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 
enzyme 


2e-04 


<NONE>


<NONE> 


<NONE> 


1178 | AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


2e-04 


<NONE> 


<NONE> 


<NONE> 


1179 | AF072847


Homo sapiens 
putative swelling- 
activated chloride 
channel (CLNS1A) 
gene, intron 6 


2e-04 


<NONE> 


<NONE> 


<NONE> 


1180 | AF012899


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


2e-04 


<NONE> 


<NONE> 


<NONE> 


1181 | U76524


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA. complete cds 


2e-04


<NONE> 


<NONE> 


<NONE> 


1182 | AF027173


Arabidopsis thaliana
cellulose synthase
catalytic subunit (Ath-
A) mRNA, complete
cds


2e-04


1213557


(U50199) coded for by C.
elegans cDNA yk89e9.5; coded
for by C. elegans cDNA cm7g5;
coded for by C. elegans cDNA
cm14b9; coded for by C.
elegans cDNA yk52g5.5; coded
for by C. elegans cDNA
yk76e5.5; coded for by C.
elegans cDNA yk131f11.5; c...


8.4



SEQ
ID



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



DESCRIPTION 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



DESCRIPTION 



P VALUE 



EPITHELIAL DISCOIDIN



1183 | AF090115



Lycopersicon
esculentum cytosolic
class II small heat
shock protein HCT2
(HSP17.4) mRNA,
complete cds 



2e-04 



729008 



1184 



Sambucus nigra 
ribosome inactivating
protein precursor 
AF012899 mRNA, complete cds 



2e-04 



2507582 



DOMAIN RECEPTOR 1
PRECURSOR (TYROSINE- 
PROTEIN KINASE CAK) 
(CELL ADHESION KINASE) 
(TYROSINE KINASE DDR) 
(DISCOIDIN RECEPTOR 
TYROSINE KINASE) (TRK E) 
(PROTEIN- TYROSINE 
KINASE RTK 6) sapiens]
UWoTUEilCALm.i Kb 
PROTEIN IN MOLR-BGLX 
INTERGENIC REGION 
>gi|1788436 (AE000300)
putative regulator [Escherichia 
coli] 



1185 | AF074386



Sambucus nigra
hevein-like protein
mRNA, complete cds



2e-04 



1085500 



collagen alpha I (IX) chain - 
mouse musculus] 
>gi|744962|prfp0t5346A 
collagen:SUBUNIT=alphaI:ISO
TYPE=IX [Mus musculus] 



8.3 



7.8 



1186 | AF027173



Arabidopsis thaliana
cellulose synthase
catalytic subunit (Ath 
A) mRNA, complete 
cds 



2e-04 



2623967 



1187 | AJ005813



Arabidopsis thaliana
mRNA for
neoxanthin cleavage
enzyme



1188 | AF027174



Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 



2e-04 



2497316 



2e-04 



1001710 



(Y13942) GTN Reductase
[Agrobacterium radiobacter]



GLYCOSYLATION END
PRODUCT-SPECIFIC
RECEPTOR PRECURSOR
(RECEPTOR FOR
ADVANCED
GLYCOSYLATION END
PRODUCTS) products receptor
precursor - bovine >gi|163651
(M91212) receptor for advanced
glycosylation end products [Bos
taurus]



(D64004) hypothetical protein 



7.4 



5.3 



3.5 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Arabidopsis thaliana 






(U41263) The 3' UTR of this




1189 


AJ005813


mRNA for
neoxanthin cleavage
enzyme


2e-04 


3258584 


gene overlaps the 3' UTR of 
T19D12.6 (confirmed by EST
hits) [Caenorhabditis elegans]


2.1 


1190 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


2e-04 


2736338 


(AF038623) contains similarity 
to RNA recognition motifs 


0.89 


1191 


U72396 


esculentum class II 
small heat shock 
protein Le-HSP17.6 
mRNA. complete cds 


2e-04 


2196567 


(D88588) lipoprotein 
[Escherichia coli] 


0.69 


1192 


AF090115


Lycopersicon
esculentum cytosolic
class II small heat
shock protein HCT2
(HSP17.4) mRNA,
complete cds 


2e-04 


3319874 


(AJ006096) F-spondin 
[Branchiostoma floridae]


5e-04 


1193 


L26049 


Chlamydomonas 
reinhardtii dynein 
heavy chain alpha 
(ODA11) gene, exons
2-15, and partial cds. 


2e-04 


3876775 


(Z81077) predicted using
Genefinder; Similarity to Yeast 
protein 8248 (TR:G587531) 


2e-09 


1194 


AF100694


Mus musculus 
Pontin52 mRNA, 






<NONE>


<NONE>


1195 


AF064029 


Helianthus tuberosus
lectin 1 mRNA,
complete cds


1e-04


<NONE> 


<NONE> 


<NONE> 


1196 


L34219 


Homo sapiens 
retinaldehyde-binding 
protein (CRALBP) 
gene, complete cds.


1e-04


<NONE> 


<NONE> 


<NONE> 


1197


X51S90


Rhesus monkey
interleukin-3 gene


1e-04


<NONE> 


<NONE> 


<NONE> 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ID | ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 




Plasmodium 










1198 | AE001421


falciparum
chromosome 2,
section 58 of 73 of
the complete
sequence


1e-04


<NONE> 


<NONE> 


<NONE> 


1199 AF090115 


Lycopersicon 
esculentum cytosolic 
class II small heat 
shock protein HCT2 
(HSP17.4) mRNA,
complete cds


1e-04


<NONE> 


<NONE> 


<NONE> 


1200 AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath-
B) mRNA, complete
cds


1e-04


2576287 


(Y15086) HepC protein
[Cylindrotheca fusiformis]


4.7 


1201 | AJ005813


Arabidopsis thaliana
mRNA for
neoxanthin cleavage
enzyme


1e-04


3395673 


(AB016623) RWC-3 [Oryza 
sativa] 


0.14 


1202 AF038035 


Homo sapiens

BRCA1-associated
RING domain protein
(BARD1) gene,
exons 2 and j


ye -id 


<NONE> 


<NONE> 


<NONE> 


1203 AJ005813 


Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 
enzyme


9e-05 


<NONE> 


<NONE> 


<NONE> 


1204 AB012106


Brassica rapa mRNA
for SRK45, complete
cds


9e-05


<NONE>


<NONE>




1205 U95098


Xenopus laevis
mitotic
phosphoprotein 44
mRNA, partial cds


9e-05 


<NONE> 


<NONE> 


<NONE> 


1206 AF034099


Laccaria bicolor
glyoxal malate
synthase protein
mRNA, complete cds


9e-05 


1351553


HYPOTHETICAL
LIPOPROTEIN MG348
PRECURSOR
>gi|1361668|pir||E64238
hypothetical protein MG348 -
Mycoplasma genitalium (SGC3)
>gi|3844931


8.8 
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Nearest Neighbor f BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1207 


D50006 


riuman ur*rv tut 
alpha-platelet-derived 
growth factor 
receptor, exon 6-10 


9e-05 


3063639 


(AF056494) NADH 
dehydrogenase subunit 5 
[Panorpa japonica]


5.1 


1208 


U50423 


Human Down 
Syndrome region of 
chromosome 21,
clone A41B8-IB7. 


9e-05 


124273 


INHIBIN ALPHA CHAIN 
PRECURSOR bovine 
>ffilIfvH95 fM 13273^ inhibin A 
subunit [Bos taurus] 


3.0 


1209 


AJ005813 


Arabidopsis thaliana 

mRNA for
neoxanthin cleavage
enzyme 


9e-05 


4007782 


(X72850) 2,4-
dihydroxybenzoate
monooxygenase [Sphingomonas 
sp.] 


2.3 


1210 


AC005276 


Homo sapiens clone

fragment 

UWGC:gap3 from 
tQj i.a, compicic 
sequence [Homo 
sapiens]


9e-05 


1492075 


(U60315) MC132L [Molluscum 
contagiosum virus subtype 1]


1.0 


1211 


AF100694 


Mus musculus
Pontin52 mRNA,
complete cds


9e-05 


2887423 


( A R 007 ft 84^ KTAA04'>4 fHomo 
sapiens] 


2e-10 


1212 


X77772 


C. fuscus gamma-M2-
1 crystallin mRNA.


9e-05 


2072425 


(U83115) non-lens beta gamma-
sapiens] 


7e-25 


1213 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


8e-05 


<NONE> 


<NONE> 


<NONE> 


1214 


L06178 


Apis mellifera
ligustica complete
mitochondrial
genome


8e-05


<NONE> 


<NONE> 


<NONE> 


1215 


AB012106


Brassica rapa mRNA 
for SRK45, complete 
cds 


8e-05 


<NONE> 


<NONE> ' 


<NONE> 


1216 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds


8e-05 


<NONE> 


<NONE> 


<NONE> 


1217 


L06178 


Apis mellifera
ligustica complete
mitochondrial
genome


8e-05


<NONE> 


<NONE> 


<NONE> 


1218 


AB012106


Brassica rapa mRNA
for SRK45, complete
cds


8e-05


<NONE> 


<NONE> 


<NONE> 



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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Mus musculus 










1219 


AF100694 


Pontin52 mRNA,
complete cds 


8e-05 


<NONE> 


<NONE> 


<NONE> 


1220 


AB012106


Brassica rapa mRNA 
for SRK45, complete 
cds 


8e-05 


<NONE> 


<NONE> 


<NONE> 


1221 


AB012106


Brassica rapa mRNA
for SRK45, complete
cds 


8e-05 


1722841 


WNT-11 PROTEIN
PRECURSOR (XWNT-11)
clawed frog >gi|439108 
(L23542) maternal protein 


9.9 


1222 


AF027174


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


8e-05 


1205991 


(U35637) nebulin [Homo 
sapiens] 


9.6 




1223


AF024605


Homo sapiens serine 
protease-like protease 
Sequence 2 from 
patent US 5736377 


8e-05 


3242783 


(AF055354) respiratory burst 
oxidase protein B 


8.6 


1224 


Y13148 


Rattus norvegicus 
mRNA for PAG608 
gene


8e-05 


2314243 


(AE000616) alpha-ketoglutarate
permease (kgtP)


8.1 


1225 


AJ005813


Arabidopsis thaliana
mRNA for
neoxanthin cleavage
enzyme


8e-05 


1170586 


RAS GTPASE-ACTIVATING-
LIKE PROTEIN IQGAP1
(P195) (KIAA0051)
>gi|627594|pir||A54854 Ras
GTPase activating-related
protein - human sapiens]
>gi|536844 (L33075) ras
GTPase-activating-like protein
[Homo sapiens]


7.8 


1226 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA. complete 
cds 


8e-05 


464239 


NADH-UBIQUINONE
OXIDOREDUCTASE CHAIN
4 >gi|1085185|pir||S52968
NADH dehydrogenase chain 4 -
honeybee mitochondrion
(SGC4) >gi|552446 (L06178)
NADH dehydrogenase subunit 4
[Apis mellifera ligustica]


3.5 


1227 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


8e-05 


544353 


F-SPONDIN PRECURSOR


3.5 



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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1228


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


8e-05 


483243 


apolipoprotein B-100 - chicken
(fragment) 


3.4 


1229 


AF093268


Rattus norvegicus
homer-1c mRNA,
complete cds 


8e-05 


91207 


proline-rich protein - mouse 
(fragment) musculus] 


2.2 


1230 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


8e-05 


2499181 


ZONADHESIN PRECURSOR
>gi|1066466


2.2 


1231 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


8e-05 


2499181 


ZONADHESIN PRECURSOR 
>gi|1066466


1.9 


1232 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


8e-05 


2833647 


(AF027972) flagelliform silk 
protein [Nephila clavipes]


1.6 


1233 


AF093268 


Rattus norvegicus 
homer-1c mRNA,
complete cds 


8e-05 


1163063 


(Z49821) MYO2
[Saccharomyces cerevisiae] 


0.90 


1234 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


8e-05 


1653488 


(D90914) hypothetical protein 


0.30 


1235 


M26510 


Chicken nonmuscle 
myosin heavy chain 
(MHC) gene, 
complete cds. 


8e-05 


112159 


plectin - rat


0.003 


1236 


U56402 


Human chromatin 
structural protein 
homolog


8e-05 


2083823 


(AF003384) weak similarity to
the peptidase family A2


1e-13


1237 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds 


8e-05 


437181 


(U02289) GTPase-activating 
protein [Caenorhabditis elegans]


2e-17 


1238


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


8e-05 


465983 


HYPOTHETICAL 80.8 KD 
PROTEIN ZC21.4 IN 
CHROMOSOME III 


8e-27 



<2H<\ 
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Nearest Neiahbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1239 


AF090115 


Lycopersicon
esculentum cytosolic 
class II small heat 
shock protein HCT2 
(HSP17.4) mRNA, 
complete cds 


7e-05 


<NONE> 


<NONE> 


<NONE> 


1240 


U83656 


Rattus norvegicus NF- 
KB gene, promoter
region 


7e-05 


3880858 


(AL031633) predicted using
Genefinder; cDNA EST
yk304f12.5 comes from this
gene [Caenorhabditis elegans]


9.3 


1241 


AF074387 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


7e-05 


3080538 


(AL022600) hypothetical 
protein 


9.2 


1242 


X89398


H.sapiens ung gene 
for uracil DNA- 
glycosylase 


7e-05


549700 


HYRHHtLlLALlU iOJ 
PROTEIN IN MDH1-VMA5 
INTERGENIC REGION 
>gi|539182|pir||S37908
hypothetical protein YKL083w -
yeast (Saccharomyces
cerevisiae) >gi|486120
(Z28082) ORF YKL083w


1.8


1243 


M83753 


Bovine follicle 
stimulating hormone- 
beta subunit gene, 
complete cds. 


7e-05


2398621 


(AJ000342) DMBT1 protein, 
5.8 kb transcript [Homo sapiens] 


1.8 


1244 


M80829 


Rat troponin T 
cardiac isoform gene,
complete cds 


5e-05 


854065 


(X83413) U88 [Human
herpesvirus 6] 


2e-08 


1245 


AF074387 


Sambucus nigra 
hevein-like protein
mRNA, complete cds


4e-05 


120240 


FLAGELLIN B2 PRECURSOR 

Methanococcus voltae 

>gi|150063 (M72148) flagellin


5.2 


1246 


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA. complete cds 


3e-05 


<NONE> 


<NONE> 


<NONE> 


1247 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA. complete cds 


3e-05 


<NONE> 


<NONE> 


<NONE> 


1248


AF074386


Sambucus nigra 
hevein-like protein 
mRNA. complete cds 


3e-05 


<NONE> 


<NONE> 


<NONE> 
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SEQ 
ID 


Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


1249 


AF093268 


Rattus norvegicus 
homer-1c mRNA,
complete cds 


3e-05 


<NONE> 


<NONE> 


<NONE> 


1250 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


3e-05 


2773226 


(AF039716) Similar to protein 
kinase [Caenorhabditis elegans]


6.7 


1251 


AF100694


Mus musculus
Pontin52 mRNA, 
complete cds 


3e-05 


2072961 


(U93568) putative p150 [Homo
sapiens] 


5.6 


1252 


U72396 


Lycopersicon 
esculentum class II 
small heat shock 
protein Le-HSP 17.6 
mRNA. complete cds 


3e-05 


121855 


EXOGLUCANASE II
PRECURSOR cellulose 1,4-beta-
cellobiosidase (EC 3.2.1.91) II
precursor - fungus (Trichoderma
reesei) 1,4-beta-cellobiosidase
(EC 3.2.1.91) II - fungus
cellobiohydrolase II
[Trichoderma reesei]


4.6 


1253 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA. complete cds 


3e-05 


3880516 


(AL021572) similar to CTP 
SYNTHASE (EC 6.3.4.2) (UTP-
-AMMONIA LIGASE) (CTP
SYNTHETASE)


3.3 


1254 


M88299 


Mouse brain-1 POU-
domain protein, 
complete cds. 


3e-05 


1947048 


(U66102) intimin [Escherichia 
coli]


3.0 


1255 


U95098 


Xenopus laevis 
mitotic 

phosphoprotein 44 
mRNA, partial cds 


3e-05 


3122872 


CELL-CYCLE NUCLEAR
AUTOANTIGEN SG2NA
>gi|1082650|pir||JC2522 nuclear
autoantigen - human >gi|805095
(U17989) GS2NA


2.8 


1256 


U76524 


Sambucus nigra
ribosome inactivating
protein precursor 
mRNA, complete cds 


3e-05 


1352145 


CYTOCHROME C OXIDASE
POLYPEPTIDE I chain I -
Thermus aquaticus >gi|155083
(M84341) cytochrome c oxidase
subunits precursor [Thermus
thermophilus] 


2.6 


1257


U72396 


Lycopersicon 
esculentum class II 
small heat shock 
protein Le-HSP17.6 
mRNA. complete cds 


3e-05 


28U015 


SEGMENTATION POLARITY
PROTEIN ENGRAILED 
>gi|2076747 (U42429) 
engrailed [Anopheles gambiae] 
>gi|21439iS(L*42214) 
engrailed [Anopheles gambiae]


2.0 



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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1258 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath-
B) mRNA, complete 
cds 


3e-05 


1657752 


(U62325) FE65-like protein 
[Homo sapiens]


1.7 


1259 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds 


3e-05 


2072961 


(U93568) putative p150 [Homo
sapiens] 


1.5 


1260 


U76523 


Sambucus nigra lectin 
precursor mRNA, 
complete cds 


3e-05 


1352145 


CYTOCHROME C OXIDASE 

POLYPEPTIDE I chain I - 
Thermus aquaticus >gi|155083
(M84341) cytochrome c oxidase
subunits precursor [Thermus
thermophilus]


1.1


1261 


X91890 


H.sapiens regulatory
region of HOXA7 
gene 


3e-05 


111013 


Sxr (Bkm-homolog) sex- 
determining region protein - 
mouse 


1.0 


1262 


L36936 


Homo sapiens lactase
gene, partial cds. 


3e-05 


1944352 


(D84239) IgG Fc binding 
protein [Homo sapiens]


0.99 


1263 


AB012105 


Brassica rapa mRNA
for SLG45, complete
cds 


3e-05 


417752


SMP2 PROTEIN 
>gi|320853|pir||S30911 SMP2 
protein - yeast (Saccharomyces 
cerevisiae) gene 
[Saccharomyces cerevisiae]


0.39 


1264 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


3e-05 


1708501 


INTEGRIN ALPHA CHAIN-
LIKE PROTEIN alpha Intlp 
[Candida albicans] 


0.39 


1265 


AF090115 


Lycopersicon 
esculentum cytosolic 
class II small heat 
shock protein HCT2 
(HSP17.4) mRNA,
complete cds 


3e-05 


1587031


cis-Golgi matrix protein GM130
[Rattus norvegicus]


0.20 


1266 


Z81014


Human DNA 
sequence from 
cosmid U65A4, 
between markers 
DXS366 and DXS87
on chromosome X * 


3e-05 


2072964 


(U93569) putative p150 [Homo
sapiens]


0.049 
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Mpnrest Neighbor fBlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1267 


Z96668


H.sapiens telomeric
DNA sequence, clone
7PTEL001, read
7PTEL00001.seq


3e-05 


542429 


smaller surface antigen -
Plasmodium falciparum
>gi|836640 (X76298)
glycosylated and myristilated
smaller surface antigen gallus]
>gi|1092178|prf||2023165B
surface antigen


0.029 


1268 


AB012105


Brassica rapa mRNA
for SLG45, complete 
cds 


3e-05 


3879121 


(Z70310) predicted using 
Genefinder; Similarity to Mouse 
ankyrin (PIR Acc. No. S37771);
cDNA EST EMBL:T01923
comes from this gene; cDNA
EST EMBL:D32335 comes
from this gene; cDNA EST
EMBL:D32723 comes from this
gene; cDNA ES... Genefinder;
Similarity to Mouse ankyrin
(PIR Acc. No. S37771); cDNA
EST EMBL:T01923 comes
from this gene; cDNA EST
EMBL:D32335 comes from this
gene; cDNA EST
EMBL:D32723 comes from this
gene; cDNA ES...


2e-13 


1269 


AF074385 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


3e-05 


2497677 


ZYXIN (ZYXIN 2) sapiens] 
>gi|1545954|gnl|PID|e223417
(X95735) zyxin


2e-23 


1270 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


1e-05


<NONE> 


<NONE> 


<NONE> 


1271 


X16318 


Canine mRNA for 
signal recognition 
particle 54k protein 


1e-05


3122612 


PITUITARY HOMEOBOX 3 
(HOMEOBOX PROTEIN 
PITX3) >gi|2645427 
(AF005772) homeobox protein
Pitx3 [Mus musculus]


4.4 


1272 


AB012105


Brassica rapa mRNA 
for SLG45, complete 
cds 


1e-05


1652458 


(D90905) DNA mismatch repair 
protein MutL [Synechocystis
sp.]


0.62 
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SEQ 
ID 


Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


1273 


U57843 


Human 

phosphatidylinositol
3-kinase delta
catalytic subunit
mRNA, complete cds


1e-05


475909 


(X67098) ORF1A [Homo 
sapiens] 


0.22 


1274 


Z96569 


H.sapiens telomeric 
DNA sequence, clone 
2QTEL054, read
2QTEL00054.seq


1e-05


2137043 


unknown protein - rabbit 

(fragment) cuniculus]

ViuS^J i) nullum it; m .nuuji 


0.005 


1275 


AE000810


Methanobacterium 
thermoautotrophicum 
from bases 172512 to 
182957 (section 16 of
148) of the complete
genome


1e-05


3877579 


kinensin-like protein KIF4 
(SW:P33174); cDNA EST 
EMBL:D27320 comes from this 
gene; cDNA EST 
EMBL:D27322 comes from this 
gene; cDNA EST 
EMBL:D27321 comes from this
gene; cDNA EST
EMBL:D35764 comes... Mouse
kinensin-like protein KIF4
(SW:P33174); cDNA EST
EMBL:D27320 comes from this 
gene; cDNA EST 
EMBL:D27322 comes from this 
gene; cDNA EST 
EMBL:D27321 comes from this 
gene; cDNA EST 
EMBL:D35764 comes... 


6e-27 


1276 


AB012113


Homo sapiens gene 
for CC chemokine 
PARC precursor, 
complete cds 


9e-06 


<NONE> 


<NONE> 


<NONE> 


1277 


AC005830 


Homo sapiens Xpii- 
154-155 BAC GSHB- 
52411 (Genome 
Systems Human BAC 
Library), complete 
sequence [Homo 
sapiens]


9e-06 


<NONE> 


<NONE> 


<NONE> 


1278


D86245


Human MHC (HLA) 
DRB intron 1 DNA, 
partial sequence 


9e-06 


1051253 


(U37531) mucin apoprotein 
[Mus musculus]


1.3 


1279 


D79998


Human mRNA for
KIAA0176 gene, 
partial cds 


9e-06 


2833253 


HYPOTHETICAL PROTEIN 
KIAA0176 sapiens]


4e-06 
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












(zoyojo; oimuamy to xeast 




1280 


U10246 


Toxoplasma gondii 
RH uracil 
phosphoribosyl 
transferase gene, 
complete cds. 


9e-06 


3876090 


uridine kinase

(SW:URK1_YEAST); cDNA
EST EMBL:Z14695 comes
from this gene; cDNA EST
CEMSE17F comes from this
gene; cDNA EST
EMBL:D67355 comes from this
gene; cDNA EST yk209h1.5
comes from this ge...


7e-33 


1281 


U10246


Toxoplasma gondii 
RH uracil 
phosphoribosyl 
transferase gene,
complete cds. 


9e-06 


3876090 


(Z6%Jij Similarity to Yeast 
uridine kinase 

(SW:URK1_YEAST); cDNA
EST EMBL:Z14695 comes
from this gene; cDNA EST
CEMSE17F comes from this
gene; cDNA EST
EMBL:D67355 comes from this
gene; cDNA EST yk209h1.5
comes from this ge...


7e-34 


1282 


AF012899 


Sambucus nigra 
ribosome inactivating
protein precursor 
mRNA, complete cds 


8e-06 


<NONE> 


<NONE> 


<NONE> 


1283


AF012899


Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds


8e-06 


<NONE> 


<NONE> 


<NONE> 


1284 


U66340 


Human Rh blood
group C antigen
(RHCE) gene, exon
2, partial cds


8e-06 


1707155 


(U80837) F07E5.6 gene product 
[Caenorhabditis elegans] 


9.6 


1285


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds


7e-06 


<NONE> 


<NONE> 


<NONE> 


1286 


M29930 


Human insulin 
receptor (allele 2) 
gene, exons 14, 15,
16 and 17. 


4e-06 


<NONE> 


<NONE> 


<NONE> 


1287 


L42103 


Homo sapiens 
(subclone 5_d3 from 
P1 H25) DNA
sequence. 


3e-06 


<NONE> 


<NONE>


<NONE> 



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SEQ 
ID 


Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


1288 


AF012244


Mus musculus 
cerberus-like (Cer-1)
gene, complete cds


3e-06 


<NONE> 


<NONE> 


<NONE> 


1289 


Z69366 


Human DNA 
sequence from 
cosmid L96F8, 
Huntington's Disease 
Region, chromosome 
4p16.3 contains EST.


3e-06 


<NONE> 


<NONE> 


<NONE> 


1290 


Z69366 


Human DNA 
sequence from 
cosmid L96F8,
Huntington's Disease
Region, chromosome
4p16.3 contains EST.


3e-06


<NONE>


<NONE>




1291 


X85232 


H.sapiens
chromosome 3 
sequences 


3e-06 


<NONE> 


<NONE> 


<NONE> 


1292 


M32674 


Human platelet 
glycoprotein IIIa,
exons 7, 8 and 9.


3e-06 


<NONE> 


<NONE> 




1293 


D16879


Human HepG2 3' 
region cDNA, clone
hmd2a01


3e-06 


998296 


(U33484) ependymin 
[Hemiodus sp.]


5.6 


1294 


U18614 


Lagothrix lagotricha
interphotoreceptor
retinoid-binding
protein (IRBP) gene,
intron 1, complete
sequence 


3e-06 


1613846 


(U71440) polyprotein [Rice 
tungro spherical virus]


5.0 


1295 


AF090115 


Lycopersicon 
esculentum cytosolic 
class II small heat 
shock protein HCT2 
(HSP17.4) mRNA,
complete cds 


3e-06 


1477646 


(U53204) plectin [Homo 
sapiens] >gi|1477651 (U63610)
plectin [Homo sapiens] 


4.0 


1296 


AF016898


Homo sapiens B-ATF
gene, complete cds


3e-06 


1085177 


reverse transcriptase - fruit fly 
reverse transcriptase 
[Drosophila yakuba]


3.0 


1297 


AB018490


Homo sapiens DNA,
trinucleotide repeats
region


3e-06 


3876572 


(Z81522) predicted using 
Genefinder; similar to RNA 
recognition motif, (aka RRM, 
RBD, or RNP domain) 
[Caenorhabditis elegans] 


3.0 



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PCT7US00/18374 



SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank)



DESCRIPTION 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



DESCRIPTION 



P VALUE 



Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 



3e-06 



4240137 



(AB020631) KIAA0824 protein 
[Homo sapiens] 



2.7 



Homo sapiens
adenosine
monophosphate
deaminase 1
(AMPD1) gene,
exons 11-12.



3e-06 



1653775 



(D90916) thiol:disulfide 
interchange protein DsbD 
[Synechocystis sp.]



1.7 



M37929 



Homo sapiens
adenosine
monophosphate
deaminase 1
(AMPD1) gene,
exons 11-12. 



3e-06 



1653775 



(D90916) thiol:disulfide 
interchange protein DsbD 
[Synechocystis sp.] 



1.7 



Glycine max actin 
(Soy86) gene, partial 
cds 



3e-06 



1730738 



ACTIN-LIKE PROTEIN ARP5
Ynl2430p [Saccharomyces
cerevisiae]



2e-05 



Yersinia 

pseudotuberculosis 
rplC, rplD, rplW,
rplB and rpsS genes
for ribosomal proteins
L3, L4, L23, L2 and
S19



3e-06 



585879 



50S RIBOSOMAL PROTEIN 

L2 maritima >gi|437926 

(Z21677) ribosomal protein L2



2e-12 



H.sapiens DNA for 
microsatellite 
polymorphism 



2e-06 



<NONE> 



<NONE> 



<NONE> 



X64707 



H.sapiens BBC1 
mRNA 



1e-06



<NONE> 



<NONE> 



<NONE> 



Homo sapiens Xpii 
154-155 BAC GSHB-
52411 (Genome 
Systems Human BAC 
Library), complete 
sequence [Homo 
sapiens] 



1e-06



<NONE> 



<NONE> 



<NONE> 



Human electron 
transfer flavoprotein 
alpha-subunit mRNA 
complete cds. 



1e-06



<NONE> 



<NONE> 



<NONE> 



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SEQ 
ID 


Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


1307 


L25647 


Homo sapiens
fibroblast growth
factor receptor gene
(located in the central
MHC) signal peptide
and consecutive exon


1e-06


1586734 


mxcQ gene [Methylobacterium 
organophilum]


5.4 


1308 


L26261 


Human MHC class III 
HLA-RP1 gene.


1e-06


1684985 


(U20633) NADH 
dehydrogenase subunit 
[Neuwiedia veratrifolia]


1.8 


1309 


AF002283


Mus musculus alpha- 
actinin-2 associated 
LIM protein mRNA, 
alternatively spliced 
product, complete cds


1e-06


2996196 


(AF053367) carboxyl terminal 
LIM domain protein [Mus 
musculus] 


4e-17


1310 


M10935


Human haptoglobin 
gene (alpha-2 allele), 
complete cds and 
haptoglobin-related
gene, exon 1 and 
three Alu repeats. 


6e-07 


<NONE> 


<NONE> 


<NONE> 


1311


AC002251 


Homo sapiens 
(subclone 1_g6 from
BAC H76) DNA 
sequence 


4e-07 


2144491 


coagulation factor Xa (EC 
3.4.21.6) precursor norvegicus]


4.2 


1312 


AF047717 


Streptomyces 
chrysomallus 
actinomycin 
synthetase II (acmB) 
gene, complete cds


4e-07 


699196 


(U15181) 4-coumarate-CoA
ligase [Mycobacterium leprae]


1e-06


1313 


U14417 


Human Ral guanine 
nucleotide 
dissociation 
stimulator mRNA,
partial cds. 


4e-07 


544402 


GUANINE NUCLEOTIDE
DISSOCIATION 
STIMULATOR RALGDS 
FORM A (RALGEF) 
>oil3' , P57|pir||S28415 guanine 
nucleotide dissociation 
stimulator ralGDS - mouse 
>gi|193573 (L07924) guanine
nucleotide dissociation 
stimulator [Mus musculus] 


8e-08


1314 


Z79027 


H.sapiens flow-sorted
chromosome 6
HindIII fragment,
SC6pA20G8 


3e-07 


<NONE> 


<NONE> 


<NONE> 



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Nearest Neiehbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Homo sapiens 










1315 


U67167 


intestinal mucin 
(MUC2) gene, 
promoter region and 
partial cds 


3e-07 


<NONE> 


<NONE> 


<NONE> 


1316 


AF086256 


Homo sapiens full 
length insert cDNA 
clone ZD41C11 


3e-07 


<NONE> 


<NONE> 


<NONE> 


1317 


U67228 


Human clone HS4.61 
Alu-Ya5 sequence 


3e-07 


1938437 


(U97003) contains similarity to 
C4-type zinc fingers and a 
ligand-binding domain of 
nuclear hormone receptors 


2.3 


1318


U94346 


Human calpain-like 
protease (htra-3) 
mRNA, complete cds


3e-07 


2911858 


(AF047659) No definition line 
found [Caenorhabditis elegans]


0.39 


1319


Y15724


Homo sapiens 
SERCA3 gene, exons 
1-7 (and joined CDS) 


1e-07


<NONE> 


<NONE> 


<NONE> 


1320 


X13596 


Bean DNA for 
glycine-rich cell wall 
protein GRP 1.8 


1e-07


<NONE> 


<NONE> 


<NONE> 


1321 


M83094 


Homo sapiens 
cytosolic selenium- 
dependent glutathione 
peroxidase gene, 
complete cds, and 
rhoh12 gene, 3' end.


1e-07


1326385 


(U58751) C07G1.7 gene
product [Caenorhabditis 
elegans] 


8.0 


1322 


Z55905 


H.sapiens CpG DNA,
clone 71f4, forward
read cpg71f4.ft1a.


1e-07


1076802


extensin-like protein - maize 
>si|600 H8 mays] 


0.61 


1323 


X03541 


Human mRNA of trk 
oncogene > :: 

gD|iyoiooiiyoioo 
Sequence 23 from 
patent US 5734039 


le-07 


325465 


(M74509) [Human endogenous 
retrovirus type C oncovirus 
sequence.], gene product [Homo
sapiens] 


3e-04 


1324 


AF027766 


Canis familiaris Y- 
linked zinc finger 
protein 


1e-07


220643 


(D10628) zinc finger protein
[Mus musculus]


7e-08 


1325 


D13613 


Bovine mRNA for
rabphilin-3A,
complete cds >
dbj|E07809|E07809
cDNA encoding
rabphilin-3A


1e-07


2822161 


(AC004082) rab3 effector-like;
35% Similarity to AF007836
(PID:g2317778) [Homo
sapiens]


6e-11
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Human mRNA For c- 






(J04169) gag-onc fusion protein




1326 


X57110


cbl proto-oncogene 


1e-07


323270 


[Cas NS1 retrovirus]


3e-14


1327 


X57110


Human mRNA for c- 
cbl proto-oncogene 


1e-07


115855


PROTO-ONCOGENE C-CBL 
human >gi|29731 (X57110) c-
cbl protein [Homo sapiens] 


4e-19 


1328 


AC001178


Homo sapiens
(subclone 2_g12 from
BAC H94) DNA 
sequence 


4e-08 


<NONE> 


<NONE> 


<NONE> 


1329 


U11866


Human interleukin-8
receptor type B
(IL8RB) gene,
promoter and exons I- 
6 


4e-08 


<NONE> 


<NONE> 


<NONE> 


1330 


AC001225 


Homo sapiens 
(subclone 2_e6 from 
BAC H94) DNA 
sequence 


4e-08


478184


histone H1 II-1 (clone L95) -
midge


6.5 


1331 


M73837 


Human modulator 
recognition factor 2
(MRF-2) mRNA, 
complete cds. 


4e-08 


141448


HYPOTHETICAL 32.6 KD
PROTEIN IN TRANSPOSON
TN4556 >gi|80758|pir||JQ0428
hypothetical 32.6K protein - 
Streptomyces fradiae transposon 
Tn4556 


4.7 


1332 


AC006164 


Homo sapiens clone 
UWGC:y28gap from 
6p21, complete
sequence [Homo
sapiens]


4e-08


2580578 


(AF000996) ubiquitous TPR 
motif, Y isoform [Homo 
sapiens] 


1.2 


1333 


X01060 


Human mRNA for 
transferrin receptor 


4e-08


135514 


T-CELL RECEPTOR BETA 
CHAIN PRECURSOR 
precursor (ANA ID- rabbit 


0.61 


1334 


Y10697


H.sapiens INE2
mRNA


4e-08


124909 


INSULIN RECEPTOR- 
RELATED PROTEIN 
PRECURSOR (IRR) (IR- 
RELATED RECEPTOR) 
>gi|186555 sapiens] 


0.14 


1335 


U60416 


Rattus norvegicus 
myr 6 myosin heavy 
chain mRNA, 
complete cds 


4e-08


102189


myosin I high molecular weight
- Acanthamoeba sp.


3e-08
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












HYPOTHLUCAL ai.2 KD 




1336 


U23804 


Drosophila 
melanogaster putative 
GTP-binding_ 
regulatory protein 
beta chain (GPB) 
mRNA. partial cds. 


4e-08 


2494916 


TRP-ASP REPEATS
CONTAINING PROTEIN 
T10F2.4 IN CHROMOSOME 
III protein; similar to G-Beta 
repeat region (Trp-Asp 
domains) of guanine nucleotide 
binding protein 


1e-28


1337 


AE000213 


Escherichia coli K-12 
MG1655 section 103
of 400 of the 
complete genome 


4e-08 


3294172


(AL022325) tF27C3.1.1 
(protein similar to C. elegans 
protein B0035.16) (isoform 1) 
[Homo sapiens] 


2e-67 


1338 


D89821


Mus musculus mRNA 
for RhoM, complete 
cds 


2e-08 


3024539 


RHO-RELATED GTP-
BINDING PROTEIN RHOD
(RHO-RELATED PROTEIN
HP1) (RHOHP1) sapiens]


1e-04


1339 


U74382 


Human telomeric 
repeat DNA-binding 
protein (PIN2) 
mRNA, complete cds 


1e-08


<NONE> 


<NONE> 


<NONE> 


1340 


L35657 


Homo sapiens 
(subclone H8 5_a10
from P1 35 H5 C8)
DNA sequence.


1e-08


<NONE> 


<NONE> 


<NONE> 


1341 


L21936 


Human succinate 
dehydrogenase 
flavoprotein subunit


1e-08


3201678 


(AF060886) adenine 
phosphoribosyltransferase 
[Leishmania tarentolae] 


4.0 


1342 


AB009777 


Homo sapiens gene 
for osteonidogen, 
promoter region 


1e-08


479388 


train - wheat 

> 2 i|39 l929M|PID|dl003454 


2.2 


1343 


M58600 


Human heparin 
cofactor II (HCF2) 
gene, exons 1 through
5.


1e-08


1730173


GLUCOSE-6-PHOSPHATE
ISOMERASE, CYTOSOLIC 2
(GPI) (PHOSPHOGLUCOSE
ISOMERASE) (PGI) isomerase
[Clarkia concinna]


1.9 


1344


M58600


Human heparin
cofactor II (HCF2)
gene, exons 1 through
5.


1e-08


1730173


GLUCOSE-6-PHOSPHATE
ISOMERASE, CYTOSOLIC 2 
(GPI) (PHOSPHOGLUCOSE 
ISOMERASE) (PGI) isomerase 
[Clarkia concinna] 


1.7 


1345 


AC000980 


Homo sapiens 
(subclone l_g2 from 
P1 H31) DNA
sequence


1e-08


439877


(L27428) reverse transcriptase
[Homo sapiens]


1.1
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Nearest Neighbor 'BlastN vs. Gcnbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1346 


U48734 


Human non-muscle
alpha-actinin mRNA,
complete cds


1e-08


168237 


(M76546) hydroxyproline-rich 
protein [Helianthus annuus] 


0.19 


1347 


M76724 


Human leukocyte 
adhesion receptor 


1e-08


1177607


(X92485) pva1 [Plasmodium
vivax]


0.17


1348


AF067959 


Gallus gallus 
homeodomain protein
HOXD-3 mRNA,
complete cds


1e-08


3165574


(AF067942) No definition line 
found [Caenorhabditis elegans] 


0.15 


1349 


Z81014


Human DNA 
sequence from 
cosmid U65A4,
between markers
DXS366 and DXS87
on chromosome X *


1e-08


2072964 


(U93569) putative p150 [Homo
sapiens]


0.001 


1350 


X57103 


Human h-lys gene for
lysozyme (upstream
region)


7e-09


<NONE>


<NONE>


<NONE>


1351 


AF012899


Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds


7e-09 


231629 


BILE-SALT-ACTIVATED 
LIPASE PRECURSOR ESTER 
LIPASE) (STEROL 
ESTERASE) (CHOLESTEROL 
ESTERASE) salt-activated 
lipase [Homo sapiens] sapiens]


0.22 


1352 


L34741


Aplysia californica 
prohormone 
convertase (PC2) 
mRNA. complete cds. 


5e-09 


322054 


cytochrome-c oxidase (EC 
1.9.3.1) chain II precursor - 
Synechocystis sp. (PCC 6803) 
>gi|581739 sp.] 


5.0 


1353 


AF052959 


Homo sapiens type 
XV collagen 
(COL15A1) gene, 
exon 6 


4e-09 


131269 


PHOTOSYSTEM II P680 
CHLOROPHYLL A 
APOPROTEIN (CP-47 
PROTEIN) 

>gi|72708|pir||QJLV6A
photosystem II chlorophyll a-
binding protein psbB - liverwort
(Marchantia polymorpha)
chloroplast >gi|11700


1.8 
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neichbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 


















1354


L15470


Streptomyces 

clavuligerus (NRRL
3585) clavulanic acid
biosynthesis protein 
(cla) gene, complete 
cds and clavaminate 
synthase 2 (cs2) gene, 
partial cds. 


4e-09


586028 


(AGMATINE
UREOHYDROLASE) (AUH)
(PROCLAVAMINIC ACID
AMIDINO HYDROLASE)
>gi|1361423|pir||S57669
Proclavaminic acid amidino
hydrolase - Streptomyces
clavuligerus >gi|295171
Proclavaminic acid amidino
hydrolase [Streptomyces
clavuligerus]
>gi|1586122|prf||2203286B
proclavaminic acid amidino
hydrolase [Streptomyces
clavuligerus]


4e-13


1355 


AB002302 


Human mRNA for 
KIAA0304 gene, 
complete cds 


2e-09 


131600 


PATHWAY PROTEIN L 
product [Klebsiella pneumoniae] 
>gi|149311 (M32613) pulL


2.5 


1356 


L34219 


Homo sapiens 
retinaldehyde-binding
protein (CRALBP)
gene, complete cds.


1e-09


<NONE> 


<NONE> 


<NONE> 


1357 


AB002302 


Human mRNA for 
KIAA0304 gene, 
complete cds 


1e-09


2224549 


(AB002302) KIAA0304 [Homo 
sapiens] 


5.0 


1358 


D35731 


Homo sapiens 
HSPA1L gene for
heat shock protein 70
testis variant, 5'UTR,
partial sequence


1e-09


1389766


(U58658) unknown [Homo
sapiens] 


1.3 


1359 


AF064483


Homo sapiens natural
resistance-associated
macrophage protein 2
(NRAMP2) gene,
exon 17, alternatively
spliced non-IRE 
form, complete cds 


8e-10 


113671 


!!!! ALU CLASS F WARNING 
ENTRY !!!! 


0.72 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1360 


AF002283 


Mus musculus alpha- 
actinin-2 associated 
LIM protein mRNA, 
alternatively spliced 
product, complete cds 


6e-10


2996196


(AF053367) carboxyl terminal
LIM domain protein [Mus 
musculus] 


4e-21 


1361 


M26220 


African green 
monkey origin of 
replication 


5e-10


2143455


gene DMR-N9 protein - mouse
(fragment)


8.8 


1362 


Z78006 


H.sapiens flow-sorted 
chromosome 6 
HindIII fragment,
SC6pA7F10


4e-10


2072977


(U93574) putative p150 [Homo
sapiens]


0.005 


1363 


U82303 


Homo sapiens 
unknown protein 
mRNA, partial cds 


2e-10 


1825711 


(U88183) similar to the
immunoglobulin superfamily,
most similar to neural cell
adhesion proteins
[Caenorhabditis elegans]


0.031 


1364 


AF079764 


Drosophila 
melanogaster 
enhancer of 
polycomb 


2e-10


3757890


(AF079764) enhancer of
polycomb [Drosophila
melanogaster]


1e-10


1365 


L24123 


Homo sapiens NRF1
protein (NRF1) 
mRNA. 


2e-10 


3004573 


(AC004520) similar to NFE2- 
related transcription factors; 
similar to 148694 
(PID:g2137676) [Homo
sapiens]


4e-53


1366 


M91454 


Orangutan alpha- 
globin gene duplicate 
region. 


1e-10


464239


NADH-UBIQUINONE
4 >gi|1085185|pir||S52968
NADH dehydrogenase chain 4 -
honeybee mitochondrion
(SGC4) >gi|552446 (L06178)
NADH dehydrogenase subunit 4
[Apis mellifera ligustica]


6.0 


1367 


D87117


House mouse;
Musculus domesticus
brain mRNA for
SAP102, complete
cds


6e-11


473912


(L31961) phosphoprotein [Mus
cookii]


2.2 


1368 


AC001002 


Homo sapiens 
(subclone 2_h9 from 
P1 H39) DNA
sequence


5e-11


<NONE> 


<NONE> 


<NONE> 




Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


1369


AC001002


Homo sapiens
(subclone 2_h9 from
P1 H39) DNA
sequence


5e-11


<NONE> 


<NONE> 


<NONE> 


1370 


AB007874 


Homo sapiens 
KIAA0414 mRNA,
partial cds


5e-11


<NONE> 


<NONE> 


<NONE> 


1371 


AC001002 


Homo sapiens 
(subclone 2_h9 from 
P1 H39) DNA
sequence


5e-11


<NONE> 


<NONE> 


<NONE> 


1372 


AC001002 


Homo sapiens 
(subclone 2_h9 from 
P1 H39) DNA
sequence


5e-11


<NONE> 


<NONE> 


<NONE> 


1373 


AC001002


Homo sapiens
(subclone 2_h9 from
P1 H39) DNA
sequence


5e-11


<NONE> 


<NONE> 


<NONE> 


1374 


AC001002 


Homo sapiens 
(subclone 2_h9 from
P1 H39) DNA
sequence


5e-11


<NONE>


<NONE>


<NONE> 


1375 


Z21352


H.sapiens mRNA for
HERV-K long
terminal repeat


5e-11


419481


gag polyprotein - human 
endogenous virus S71 


4.6 


1376 


AB007928 


Homo sapiens mRNA 
for KIAA0459 
protein, partial cds


5e-11


2947238


(AF051782) diaphanous 1
[Homo sapiens]


2.8 


1377 


D87117


House mouse;
Musculus domesticus
brain mRNA for
SAP102, complete
cds


5e-11


473912


(L31961) phosphoprotein [Mus
cookii]


1.8


1378 


AJ131501


Homo sapiens DNA
sequence between
two AML1 gene
promoters, 6423 BP


5e-11


728831


!!!! ALU SUBFAMILY J
WARNING ENTRY


0.20 


1379 


M27826


Human endogenous
retroviral protease
mRNA, complete cds.


5e-11


88558


retroviral proteinase-like protein
- human


0.002 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












HYPO 1 Hfc lit AL ii.l KD 




1380 


U23804 


Drosophila 

melanogaster putative 
GTP-binding 
regulatory protein 
beta chain (GPB) 
mRNA, partial cds.


5e-11


2494916


TRP-ASP REPEATS
CONTAINING PROTEIN
T10F2.4 IN CHROMOSOME
III protein; similar to G-beta
repeat region (Trp-Asp
domains) of guanine nucleotide
binding protein


le-JO 


1381 


Z22784 


M.musculus troponin 
I gene.


3e-11


3892202 


(AF072889) transcription 
repressor brain factor 2 


0.053 


1382 


AB007880 


Homo sapiens 
KIAA0420 mRNA, 
complete cds 


2e-11


<NONE> 


<NONE> 


<NONE> 


1383 


AF020361 


Homo sapiens BAX
gene, exon 6, partial
sequence


2e-11


<NONE> 


<NONE> 


<NONE> 




1384


L35600


Homo sapiens DNA
sequence.


2e-11


1174952


GLYCOPROTEIN D
PRECURSOR gD [Bovine
herpesvirus 1]


0.25 


1385 


U21943 


Human organic anion 

transporting 

polypeptide 


2e-11


2738223


(U95011) brain-specific organic
anion transporter 


9e-19 


1386 


U90878 


Homo sapiens 
carboxyl terminal 
LIM domain protein 


2e-11


2996196


(AF053367) carboxyl terminal
LIM domain protein [Mus 
musculus] 


4e-23 


1387 


U31929 


Human orphan 
nuclear receptor 
(DAX1) gene,
complete cds


6e-12


<NONE> 


<NONE> 


<NONE> 


1388


M25828


Human von
Willebrand factor
gene, exon 1, 2, and
3, and three Alu 
repetitive elements. 


6e-12 


<NONE> 


<NONE> 


<NONE> 


1389 


AB020648 


Homo sapiens mRNA
for KIAA0841
protein, partial cds


3e-12


<NONE> 


<NONE> 


<NONE> 


1390 


Z15026 


H.sapiens genes for 
tumor necrosis factor 
(Tnfa) and 

lymphotoxine (Tnfb)


2e-12


<NONE> 


<NONE> 


<NONE> 


1391 


L28101 


Homo sapiens 
kallistatin (PI4) gene,
exons 1-4, complete 
cds 


2e-12 


<NONE> 


<NONE> 


<NONE> 


1392 


Z47046 


Human cosmid
OLL2C9 from Xq28


2e-12


<NONE> 


<NONE> 


<NONE> 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






1393


Z79007


H.sapiens flow-sorted
chromosome 6
HindIII fragment,
SC6pA20E2


2e-12


106322


hypothetical protein (L1H 3'
region) - human


1.5 


1394 


U34377 


Human tyrosine 
kinase TXK (txk) 
gene, exon 13. 


1e-12


151484


(M55524) ORF 4; putative
[Pseudomonas aeruginosa]


4.3 


1395 


D70845 


Mus musculus apg-1 
gene for novel 
member of heat shock 
protein 110, promoter
region


1e-12


113658


ALKALINE PROTEINASE
PRECURSOR (ALP) precursor -
fungus (Acremonium
chrysogenum)


3.5 


1396 


M63978


Human vascular
endothelial growth
factor gene, exon 8.


1e-12


3982737


(AF069731) calmodulin-
dependent protein kinase II beta
M isoform [Rattus norvegicus]


0.083 


1397 


U60266 


Homo sapiens 
lysosomal alpha- 
mannosidase (manB) 
mRNA, complete cds


8e-13


<NONE> 


<NONE> 


<NONE> 


1398 


Z68297 


Caenorhabditis 
elegans cosmid 
F11A10, complete
sequence
[Caenorhabditis
elegans]


7e-13


2393734


(AC002542) similar to C.
elegans F11A10.5; 80%
similarity to Z68297
(PID:g1130619) [Homo
sapiens] 


5e-34 


1399 


Z68297 


Caenorhabditis 
elegans cosmid 
F11A10, complete
sequence 
[Caenorhabditis 
elegans] 


7e-13 


2393734 


(AC002542) similar to C.
elegans F11A10.5; 80%
similarity to Z68297
(PID:g1130619) [Homo
sapiens] 


3e-38 


1400 


Z68885


Human DNA
sequence from
cosmid L21F12B,
Huntington's Disease
Region, chromosome
4p16.3, contains
EST.


6e-13


<NONE> 


<NONE> 


<NONE> 


1401 


X76104 


H.sapiens DAP-
kinase mRNA 


6e-13 


2911154 


(AB007143) ZIP-kinase [Mus
musculus] 


0.007 


1402 


Z78668


H.sapiens flow-sorted
chromosome 6 TaqI
fragment,
SC6pA13G4


5e-13


106322


hypothetical protein (L1H 3'
region) - human


2e-06 


1403 


L35600 


Homo sapiens DNA 
sequence. 


3e-13 


3184290 


(AC004136) hypothetical 
protein [Arabidopsis thaliana] 


1.7
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






1404




Cloning vector
pKODT complete
sequence




IflTATIO 
JQtOf JU 


(Z49966) F35C11.4
[Caenorhabditis elegans]


7.8 


1405 


D28126 


Human gene for ATP
synthase alpha
subunit, complete cds
(exon 1 to 12)


2e-13


419481


gag polyprotein - human
endogenous virus S71 


3.4 


1406 


AF005219 


Homo sapiens
transcription factor
HOXD13


2e-13 


2822166 


(AC004080) transcription factor 
HOXA13 [Homo sapiens] 


5e-09 


1407 


AB018301 


Homo sapiens mRNA
for KIAA0758
protein, partial cds


2e-13


3882237


(AB018301) KIAA0758 protein
[Homo sapiens]


1e-23


1408 


D70845 


Mus musculus apg-1
gene for novel
member of heat shock
protein 110, promoter
region


1e-13


113658 


ALKALINE PROTEINASE 
PRECURSOR (ALP) precursor - 
fungus (Acremonium 
chrysogenum) 


3.1 


1409 


AG000691 


Homo sapiens
genomic DNA, 21q 
region, clone: 
T17IBG33 


8e-14 


930045 


(X15332) alpha-1 (III) collagen
[Homo sapiens]


3e-04 


1410 


D30785 


Mouse mRNA for
neuropsin, complete
cds


8e-14 


3559978 


ajuujo41) serine protease 
[Rattus rattus]


2e-12 


1411


U32710


Haemophilus
influenzae Rd section
25 of 163 of the
complete genome


8e-14 


4106673 


(AL035064) queuine trna- 
ribosyltransferase 
[Schizosaccharomyces pombe]


2e-38 


1412 


AG000886 


Homo sapiens 
genomic DNA, 21q
region, clone:
64EI1X19


7e-14


1363925


hypothetical protein 2 - North
American opossum (fragment)
>gi|897721 (Z48955) ORF-2,
putative RT [Didelphis
virginiana]


1.1


1413 


Z62664 


H.sapiens CpG DNA, 
clone 71d11, forward
read cpg71d11.ft1a.


7e-14


3953461


(AC002328) F20N2.6
[Arabidopsis thaliana]


0.085 


1414 


AB014532


Homo sapiens mRNA
for KIAA0632
protein, partial cds


7e-14


113668


!!!! ALU CLASS C WARNING
ENTRY !!!! 


0.040 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1415 


Z96478 


H.sapiens telomeric 
DNA sequence, clone 
20PTEL0O4, read 
20PTELOO004.seq 


7e-14


2981631


(AB012223) ORF2 [Canis
familiaris] 


2e-04 


1416 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


4e-14 


<NONE> 


<NONE> 


<NONE> 


1417 


AF012899


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


4e-14 


<NONE> 


<NONE> 


<NONE> 


1418


AF033349


Homo sapiens MLL
gene breakpoint
cluster region, intron
1, partial sequence


3e-14


728831


!!!! ALU SUBFAMILY J
WARNING ENTRY


9.3 


1419 


AC001526 


Homo sapiens 
(subclone 4_f6 from 
P1 H54) DNA
sequence


3e-14


9986 [ 


extensin - almond >gi|20420 
(X65718) extensin 


9.2 


1420 


AF012899


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


3e-14 


728832 


!!!! ALU SUBFAMILY SB 
WARNING ENTRY 


0.15 


1421 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


2e-14




EPHRIN-A2 PRECURSOR 
(EPH- RELATED RECEPTOR 
TYROSINE KINASE LIGAND 
6) (LERK-6) sapiens] 
>gi|2924761 (AC004258) 
EPL6 HUMAN [Homo sapiens 


8.7 


1422 


AF012899


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


9e-15 


119040 


E1B PROTEIN, SMALL T-
ANTIGEN (E1B 19K)
>gi|74142|pir||Q1AD25 early
E1B 21K protein II - human
adenovirus 5 >gi|58489
(X02996) mRNA 5 first reading
frame [Human adenovirus type
5] adenovirus type 5]
>gi|209797 (J01969) 21 kD
protein


1.5 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












1423


AF012899


Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds


8e-15


477102


transcription factor GATA-4,
retinoic acid-inducible - mouse
>gi|293345 (M98339) GATA-
binding transcription factor
[Mus musculus]


0.57 


1424 


AB012223


Canis familiaris LINE
1 element ORF2
mRNA, complete cds


8e-15


92385 


hypothetical protein - rat 
(fragment) 


0.003 


1425 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


3e-15


<NONE>


<NONE> 


<NONE> 


1426 


X12433 


Human pHS1-2
mRNA with ORF
homologous to
membrane receptor
proteins


3e-15


422532 


collagen alpha 3(IV) chain - sea 
urchin 


8.9 


1427 


AF012899


Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds


3e-15


1353143


PROBABLE NUCLEAR
HORMONE RECEPTOR
E02H1.7
>gi|3875431|gnl|PID|e1344980
(Z47075) similar to Zinc finger,
C4 type (two domains)
[Caenorhabditis elegans]


5.0 


1428 


Z69651 


Human DNA 
sequence from 
cosmid L75B9, 
Huntington's Disease 
Region, chromosome 
4p16.3


3e-15


403460 


(L24521) transformation-related 
protein [Homo sapiens] 


0.60 


1429 


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds


2e-15


108750


Ig heavy chain precursor
(B/MT.4A.17.H5.A5) - bovine
>gi|440 (X62916) anti-
testosterone antibody [Bos
taurus] 


1.1 


1430 


X83299 


H.sapiens SMA3 
mRNA 


2e-15 


671530 


(X83299) SMA3 gene product
[Homo sapiens]


0.32 


1431 


U01877 


Human p300 protein 
mRNA, complete cds. 
> :: gb|I62297|I62297 
Sequence 1 from
patent US 5655784 


2e-15 


3024341 


E1A-ASSOCIATED PROTEIN
P300 


0.019 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












1432


X16516


Mouse MHC (Qa) Q2
k gene for class I
antigen, exons 4-8


1e-15


2496897


HYPOTHETICAL 45.1 KD
PROTEIN C16C10.6 IN
CHROMOSOME III
>gi|3874384|gnl|PID|e1344078
EST EMBL:C08256 comes
from this gene; cDNA EST
EMBL:C09941 comes from this
gene; cDNA EST yk340a10.3
comes from this gene; cDNA
EST yk340a10.5 comes from
this gene [Ca...


7e-08 


1433 


M74165 


Chicken tensin 
mRNA, complete cds.


1e-15


283920


tensin - chicken >gi|212752
(M74165) tensin 


2e-19 


1434 


X71893 


H.sapiens gene for
immunoglobulin
kappa light chain
variable region 04
and 05


9e-16


<NONE>


<NONE> 


<NONE> 


1435 


U05227 


Human Rar protein 
mRNA, complete cds. 


9e-16 


3036779 


(Z544/y) match: multiple 
proteins; match: 000407 
Q12829 P22127 P36861
Q40219; match: P70550
Q41022 P22125 Q08155
P35286; match: P51148 P51147
P35293 P36861 P35289; match:
P35284 Q40217 P51152
P51157 P51158; match: Q41022


3e-06 


1436 


M23404 


Chicken erythrocyte 
anion transport 
protein (band3) 
mRNA, complete cds.


9e-16


726403


(U23175) similar to anion
exchange protein
[Caenorhabditis elegans]


1e-28


1437 


X16143


Rat mRNA for liver a- 
L-Fucosidase (EC 
3.2.1.51) 


9e-16 


67502 


alpha-L-fucosidase (EC 
3.2.1.51) I precursor, tissue - 
human >gi|178409 (M29877)
alpha-L-fucosidase precursor 
(EC 3.2.1.5) [Homo sapiens] 


2e-29 


1438


AF012899


Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds


8e-16


<NONE>


<NONE>


<NONE> 


1439 


AF076981


Mus musculus brain
mitochondrial carrier
protein BMCP1
(Bmcp1) mRNA,
complete cds


8e-16


3851540


(AF078544) brain mitochondrial
carrier protein-1 [Homo sapiens]


2e-13
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






1440


Z54349


H.sapiens MN/CA9
GENE


5e-16


728831


!!!! ALU SUBFAMILY J
WARNING ENTRY


0.002 


1441 


AF077003 


Mus musculus SH3 
domain-containing 
adapter protein 
mRNA, complete cds


3e-16


309123 


(M35526) complement 
component C5D [Mus 
musculus] 


3.1 


1442 


X64587 


M. musculus mRNA 
for splicing factor 
U2AF(65kD) 


3e-16


2143767


glycoprotein - rat >gi|986943
(L08134) glycoprotein [Rattus
norvegicus] norvegicus]


0.003 


1443 


AB014561 


Homo sapiens mRNA 
for KIAA0661 
protein, complete cds 


3e-16 


3327136 


(AB014561) KIAA0661 protein 
[Homo sapiens]


1e-20


1444 


Z73987


Human DNA
sequence from
cosmid N120B6 on
chromosome 22
Contains ESTs,
complete sequence
[Homo sapiens]


1e-16


<NONE> 


<NONE> 


<NONE> 


1445 


M533LS 


Homo sapiens ala 
gene. 


1e-16


<NONE> 


<NONE> 


<NONE> 


1446 


U44103


Human small GTP
binding protein Rab9
mRNA, complete cds


1e-16


1552584 


(Z80233) hypothetical protein 
Rv0029 


1.3 


1447 


AB014561


Homo sapiens mRNA
for KIAA0661
protein, complete cds 


9e-17 


3327136 


(AB014561) KIAA0661 protein 
[Homo sapiens] 


2e-20 


1448


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-17


<NONE> 


<NONE> 


<NONE> 


1449 


M76762 


Mus musculus 
ribosomal protein (Ke 
3) gene, exons 1 to 5,
and complete cds.


1e-17


1073048 


pupR protein - Pseudomonas 
putida >gi|525260 


0.36 


1450 


D50561 


Human DNA,
replication enhancing
element (REE1)


4e-18


126295


LINE-1 REVERSE
TRANSCRIPTASE 
HOMOLOG 


0.78 


1451 


D16431


Human mRNA for
hepatoma-derived
growth factor,
complete cds 


4e-18 


3242079 


(AJ006984) proline-rich protein


0.018
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1452 


AF088983 


Mus musculus heat 
shock protein hsp40-^ 
mRNA, complete cds 


4e-18


3873707


(Z73102) Similarity to B.subtilis
DNAJ protein
(SW:DNAJ_BACSU); cDNA
EST yk437a1.5 comes from this
gene [Caenorhabditis elegans]


9e-25 


1453 


U60205 


Human methyl sterol 
oxidase (ERG25) 
mRNA, complete cds


3e-18 


<NONE> 


<NONE> 


<NONE> 


1454 


AF038177 


Homo sapiens clone 
23899 mRNA 
sequence 


1e-18


1360775 


G protein-coupled receptor 74 - 
equine herpesvirus 2 >gi|695246 
(U20824) G protein-coupled 
receptor [Equine herpesvirus 2] 


5.1 


1455 


AB014561


Homo sapiens mRNA
for KIAA0661
protein, complete cds


1e-18


3327136


(AB014561) KIAA0661 protein
[Homo sapiens]


1e-21


1456 


AB014561 


Homo sapiens mRNA 
for KIAA0661 
protein, complete cds 


1e-18


3327136


(AB014561) KIAA0661 protein
[Homo sapiens]


1e-22


1457 


U34374 


Human tyrosine 
kinase TXK (txk) 
gene, exons 9 and 10.


1e-19


<NONE> 


<NONE> 


<NONE> 


1458 


AB006969 


Homo sapiens 

kf,A A 1 mD V J. 

nUnAl mr\..N rV, 

complete cds 


1e-19


4151809 


(AF1028^d) synaptic SAPAP- 
interacting protein Synamon 


0.19 


1459 


AB002293 


Human mRNA for 
KIAA0295 gene, 
partial cds 


1e-19


2224531


(AB002293) KIAA0295 [Homo
sapiens]


6e-17


1460 


Z59664


H.sapiens CpG DNA,
clone 168f9, reverse
read cpg168f9.rt1a.


5e-20


3880251


(Z82055) predicted using
Genefinder


6.5 


1461 


M73837


Human modulator
recognition factor 2
(MRF-2) mRNA,
complete cds.


5e-20


284313


modulator recognition factor 2 -
human factor 2 [Homo sapiens]


0.019 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1462 


U24267 


Human pyrroline-5-
carboxylate
dehydrogenase


5e-20 


2506350


DELTA-1-PYRROLINE-5-
CARBOXYLATE
DEHYDROGENASE
PRECURSOR (P5C
DEHYDROGENASE)
>gi|1353248 sapiens]
>gi|1353250 (U24267) pyrroline
5-carboxylate dehydrogenase
[Homo sapiens]
>gi|1589585|prf||2211355A
Delta 1-pyrroline-5-carboxylate
dehydrogenase [Homo sapiens]


5e-04 


1463 


U13262


Mus musculus myelin
gene expression
factor


4e-20


536926


(U13262) myelin gene
expression factor [Mus 
musculus] 


3e-07 


1464 


U13262


Mus musculus myelin 
gene expression 
factor 


4e-20 


3126878 


(AF061832) M4 protein
deletion mutant [Homo sapiens] 


1e-08


1465 


Z61239


H.sapiens CpG DNA,
clone 48f10, forward
read cpg48f10.ft1a.


4e-20


1669601


(D88747) AR401 [Arabidopsis
thaliana]


8e-19


1466 


U89915


Mus musculus
junctional adhesion
molecule (Jam)
mRNA, complete cds


1e-20




(U89915) junctional adhesion
molecule [Mus musculus]


7e-11


1467 


AF029071


Gallus gallus p52 pro-
apototic protein
mRNA, complete cds


7e-22


2599492


(AF029071) p52 pro-apototic
protein [Gallus gallus]


1e-15


1468 


M25636


Figure 4. Nucleotide
sequence of the
pKS36 1.797 kb
insert.


6e-22


1196398


(M21305) unknown protein
[Homo sapiens]


0.65 


1469 


AB020655


Homo sapiens mRNA
for KIAA0848
protein, complete cds


6e-22


4240325


(AB020725) KIAA0918 protein
[Homo sapiens]


1e-19
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 










1470


S80935


chorionic
gonadotropin beta 1
(CG beta 1) subunit


5e-22


115310


PROCOLLAGEN ALPHA
1(IV) CHAIN PRECURSOR
>gi|84917|pir||A31893 collagen
alpha 1(IV) chain precursor -
fruit fly (Drosophila
melanogaster) melanogaster]
>gi|157078 (M96575) type IV
collagen pro-collagen
[Drosophila melanogaster]


0.027 


1471


AF053066


Homo sapiens 
microsatellite 
D5S2926 sequence 


2e-22 


728831 


!!!! ALU SUBFAMILY J 
WARNING ENTRY 


3e-04 


1472


U55177


Danio rerio carbonic
anhydrase homolog
CAH-Z mRNA,
complete cds


2e-22


3123190


CARBONIC ANHYDRASE
(CARBONATE
DEHYDRATASE) >gi|2576335
(U55177) CAH-Z [Danio rerio]


5e-14


1473


AF064250


Gallus gallus
ubiquitin specific
protease 66


2e-22


2736064


(AF016107) ubiquitin specific
protease 41 [Gallus gallus]


7e-37 


1474


AF030880


Homo sapiens
pendrin (PDS)
mRNA, complete cds


2e-22


729367


DRA PROTEIN (DOWN-
REGULATED IN ADENOMA)
>gi|2135020|pir||A47456 down-
regulated in adenoma (DRA) -
human >gi|291964 (L02785)
Nuclear localization signal at
AA 569-573, 576-580, 579-583;
acidic transcr. activ. domain 620
640,; homeobox motif 653-676
[Homo sapiens]


4e-53 


1475


AF100694


Mus musculus
Pontin52 mRNA,
complete cds 


6e-23 


<NONE> 


<NONE> 


<NONE> 


1476


X57398


Human mRNA for
pM5 protein


3e-23


107350


Pm5 protein - human
>gi|1335273|gnl|PID|e36:4l


1e-04


1477


AB010998


Rattus norvegicus
PAD-R11 mRNA for
peptidylarginine
deiminase type I,
complete cds


2e-23 


<NONE> 


<NONE> 


<NONE> 


1478


D10871


Human hNAT allele
2-2 gene for
arylamine N-
acetyltransferase


2e-23


171200


(J04734) CDC6 protein
[Saccharomyces cerevisiae]


9.8


1479


D10871


Human hNAT allele
2-2 gene for
arylamine N-
acetyltransferase


2e-23


171200


(J04734) CDC6 protein
[Saccharomyces cerevisiae]


8.3



Nearest Neighbor (BlastN vs. Genbank)



SEQ

ID | ACCESSION | DESCRIPTION
Homo sapiens MLL-



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



P VALUE | ACCESSION 



DESCRIPTION 



AF4 fusion protein
1480 | AF024541 | mRNA, partial cds



2e-23 | 2136142



Human AF-4 mRNA,
1481 | L13773 | complete cds.



2e-23 | 3063962



P VALUE



serine/proline-rich FEL protein,
splice form 1 - human



1e-20



(AF031404) MLL-AF4 fusion 
protein [Homo sapiens] 



Mus musculus
Pontin52 mRNA,
1482 | AF100694 | complete cds



8e-24 



<NONE> 



<NONE> 



<NONE>



Drosophila
melanogaster Rga and
Atu genes, complete
1483 | U75467 | cds



8e-24 



1658503 



(U75467) Atu [Drosophila 
melanogaster] | 2e-37 



1484 | D17076



Human HepG2 partial
cDNA, clone
hmd5a09m5



7e-24 



<NONE> 



<NONE> 



Mus musculus
Pontin52 mRNA,
1485 | AF100694 | complete cds



7e-24 



1169643 



FMRF-AMIDE-RELATED
NEUROPEPTIDES
PRECURSOR >gi|416208
(U03137) neuropeptide
precursor FMRFamide-related
peptide [Lymnaea stagnalis]



<NONE>



7e-10 



Human 28S
1486 | M11167 | ribosomal RNA gene.



2e-24 



Mus musculus
Pontin52 mRNA,
1487 | AF100694 | complete cds



3875481



(Z81054) predicted using
Genefinder; Similarity to UDP-
glucuronosyl transferases



5.1 



2e-24 



Cloning vector
pAP3neo DNA,
1488 | AB003468 | complete sequence



549173



USP1 PROTEIN PRECURSOR
>gi|169623



1.2 



2e-24 



1489 | X03541



Human mRNA of trk
oncogene > ::
gb|I96186|I96186
Sequence 23 from
patent US 5734039



987050



(X65335) lacZ gene product
[unidentified cloning vector]



0.05$ 



2e-24 



1490 | L81652



Homo sapiens
(subclone 2_g11 from
P1 H43) DNA
sequence



325465



(M74509) [Human endogenous
retrovirus type C oncovirus
sequence.], gene product [Homo
sapiens]



3e-04 



2e-24 



1491 



Drosophila
melanogaster
strawberry notch
(sno) mRNA,
U95760 | complete cds



225047



Mus musculus
Pontin52 mRNA,
1492 | AF100694 | complete cds



2e-24 



2078282 



reverse transcriptase related 
protein [Homo sapiens] 



4e-12



(U95760) Sno [Drosophila 
melanogaster] 



2e-41 



8e-25 



2623773 



(AF004835) tyrocidine
synthetase 3 [Brevibacillus 
brevis] 



121 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1493 


AB002405 


Homo sapiens mRNA 
for LAK-4p, 
complete cds 


8e-25 


2496822 


HYPOTHETICAL 127.3 KD
PROTEIN B0416.1 IN
CHROMOSOME X >gi|746502
(U23516) B0416.1 gene product
[Caenorhabditis elegans]


9e-11


1494 


K03002 


Human mRNA from 
chromosome 15 gene 
with homology to 
MHC-HLA-SB-1 
intron A. 


8e-25 


1514614 


(X92842) nuclear protein [Mus
musculus]


1e-13


1495


U61232


Human tubulin-
folding cofactor E
mRNA, complete cds


7e-25


1465772


(U61232) cofactor E [Homo
sapiens]


2e-05 


1496 


U10245


Arabidopsis thaliana
Col-0 putative RNA
helicase A mRNA,
complete cds


5e-25 


i JJJiJ7 


(U10245) putative RNA
helicase A [Arabidopsis
thaliana]


ieo / 


1497 


X89211


H.sapiens DNA for 
endogenous retroviral 
like element 


3e-25 


2065210 


(X89211) Pro-Pol-dUTPase
polyprotein


5e-06 


1498


L81652


Homo sapiens
(subclone 2_g11 from
P1 H43) DNA
sequence


3e-25


2072961


(U93568) putative p150 [Homo
sapiens]


5e-16


1499 


X82895


H.sapiens mRNA for
DLG2


2e-25


2497511


MAGUK P55 SUBFAMILY
(DISCS, LARGE HOMOLOG
2)


Je-34 


1500 


M36654 


Mouse homeo box 
2.6 (Hox-2.6) mRNA, 
complete cds. 


9e-26 


3323169 


(AE001255) T. pallidum
predicted coding region TP0854


1.9 


1501 


L36315 


Mus musculus (clone 
pMLZ-1) zinc finger
protein


9e-26


1806134


(Z67747) zinc finger protein
[Mus musculus]


4e-05 


1502 


AB018281 


Homo sapiens mRNA 
for KIAA0738 
protein, complete cds


9e-26


728831


!!!! ALU SUBFAMILY J
WARNING ENTRY


1e-07


1503 


AF017433


Homo sapiens
putative transcription
factor CR53


9e-26


3219985


ZINC FINGER PROTEIN ZFP- 

:9 


1e-17




Nearest Neighbor (BlastN vs. Genbank)



SEQ
ID



ACCESSION 



DESCRIPTION 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



P VALUE | ACCESSION 



DESCRIPTION 



P VALUE 



1504 | AC001225



Homo sapiens
(subclone 2_e6 from
BAC H94) DNA
sequence



8e-26



2653713



(U91823) small S protein
[Hepatitis B virus]



4.3 



1505 | AF100694 



Mus musculus
Pontin52 mRNA, 
complete cds 



8e-26 



283446 



cysteine-rich surface antigen 72,
CRP72 - Giardia lamblia 
(fragment) 



3.4 



1506 | X94912



H. sapiens Pr22 gene 



3e-26 



728837 



!!!! ALU SUBFAMILY SQ 
WARNING ENTRY 



4e-09 



1507 | AF100694



Mus musculus 
Pontin52 mRNA, 
complete cds 



2e-26 



<NONE> 



<NONE> 



<NONE> 



1508 | U44103



Human small GTP
binding protein Rab9
mRNA, complete cds



1e-26



3327038



(AB014512) KIAA0612 protein
[Homo sapiens]



iapti 

(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]



8.7 



1509 | AF100694



Mus musculus 
Pontin52 mRNA, 
complete cds 



9e-27 



4056454 



0.14 



1510 | AG001212



Homo sapiens 
genomic DNA, 21q 
region, clone: 
9HIIN46 



9e-27 



126296 



LINE-1 REVERSE 
TRANSCRIPTASE 
HOMOLOG protein 
[Nycticebus coucang]



0.012 



1511 | AF027131



Mus musculus mucin 
glycoprotein MUC3 
mRNA, partial cds 



9e-27 



2589172 



(U76551) mucin Muc3 [Rattus 
norvegicus]



2e-14 



1512



U49057 



Rattus norvegicus 
CTD-binding SR-like 
protein rA9 mRNA, 
complete cds 



5e-27 



1438534 



(U49057) rA9 [Rattus 
norvegicus] 



1e-04



1513 | J03764



1514 | Z78160



1515 | Z64210



Human, plasminogen 
activator inhibitor- 1 
gene, exons 2 to 9. 



3e-27 



M. musculus partial 
cochlear mRNA 
(clone 2SD2) 



<NONE> 



3e-27 



1490362 



H.sapiens CpG DNA, 
clone 99b4, reverse 
read cpg99b4.rtla . 



3e-27 



2257538 



<NONE> 



(Z78160) unknown [Mus
musculus]



(AB004538) LIPOIC ACID
SYNTHETASE
PRECURSOR (LIP-SYN)
[Schizosaccharomyces pombe]



<NONE> 



2e-05 



1e-06



WO 01/02568 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Homo sapiens 










1516 


L35659 


(subclone H8 6_h6 
from PI 35 H5 CS) 
DNA sequence. 


1e-27


<NONE> 


<NONE> 




1517 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-27


1644471 


(U72686) odorant receptor 4 

[Danio rerio]


/ .J 


1518 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-27


2738388 


(AF003534) hypothetical 
protein 004L [Chilo iridescent 
virus] 


7 


1519 


AB009271


Homo sapiens gene 
for BCNT, partial cds


1e-27


3880909 


(AL032636) Y40B1B.3 

[Caenorhabditis elegans]




1520 


AF100694


Mus musculus 
Pontin52 mRNA,
complete cds


1e-27


2133579 


spermatophorin Sp23 - yellow
mealworm molitor]


n ss 


1521 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-27


121805 


ENDOGLUCANASE A 
PRECURSOR 


0.58 


1522 


AF100694 


Mus musculus 
Pontin52 mRNA,
complete cds 


1e-27


3722000 


(AF035323) survival motor 
neuron protein [Bos taurus] 


0.10 


1523 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-27


3323188 


(AF074902) laminin alpha chain 
[Caenorhabditis elegans] 


0.083 


1524 


AF074382


Homo sapiens IkB 
kinase gamma subunit 


1e-27


3641280 


(AF074382) IkB kinase gamma 
subunit [Homo sapiens] 


0.041 


1525 


AF100694


Mus musculus 
Pontin52 mRNA,
complete cds 


1e-27


4056454 


(AC005990) Contains repeated
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


6e-04 


1526 


L78778 


Homo sapiens 
(subclone 2_e10 from
PI H49) DNA 
sequence 


1e-27


225047 


reverse transcriptase related 
protein [Homo sapiens] 


2e-09 


1527 


L03427 


Human zinc finger 
protein basonuclin 
mRNA, complete cds.


1e-27


1488275 


(U59694) zinc finger protein
basonuclin [Homo sapiens] 


9e-22 


1528


U09954


Human ribosomal 
protein L9 gene, 5'
region and complete
cds.


4e-28


2257538 


(AB004538) LIPOIC ACID
SYNTHETASE
PRECURSOR (LIP-SYN)
[Schizosaccharomyces pombe]


2e-04 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1529 


Z64210 


H.sapiens CpG DNA, 
clone 99b4, reverse
read cpg99b4.rtla . 


4e-28 


3878570 


(^46380 similar to lipoic acid
synthase; cDNA EST yk283b6.3
comes from this gene; cDNA 
EST yk283b6.5 comes from this 
gene; cDNA EST yk472f5.3 
comes from this gene; cDNA 
EST yk472f5.5 comes from this 
gene; cDNA EST yk476e7.3... 



7e-11


1530 


U55177 


Danio rerio carbonic 
anhydrase homolog 
CAH-Z mRNA, 
complete cds 


4e-28 


3123190 


CARBONIC ANHYDRASE 
(CARBONATE 

DEHYDRATASE) >gi|2576335 
(U55177) CAH-Z [Danio rerio]


5e-21


1531


D43682 


Human mRNA for 
very-long-chain acyl- 
CoA dehydrogenase 
(VLCAD), complete 
cds 


4e-28 


1351839 


ACYL-COA 

DEHYDROGENASE, VERY- 
LONG-CHAIN SPECIFIC 
PRECURSOR (VLCAD)
>gi|930358 taurus] 


3e-27 


1532 


AF016591


Homo sapiens 
survival motor neuron 
pseudogene, complete
sequence 


3e-28


728831 


!!!! ALU SUBFAMILY J 
WARNING ENTRY 


3e-08 


1533 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


2e-28 


728832 


!!!! ALU SUBFAMILY SB
WARNING ENTRY 


2.5 


1534 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


2e-28 


118588 


DEHYDRIN DHN3 
>gi|100035|pir||S18139 dehydrin
DHN3 - garden pea >gi|20709
(X63063) pea dehydrin DHN3
[Pisum sativum]


0.004 


1535 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


2e-28 



1169643


FMRFAMIDE-RELATED
NEUROPEPTIDES
PRECURSOR >gi|416208
(U03137) neuropeptide
precursor FMRFamide-related
peptide [Lymnaea stagnalis]


6e-04 


1536 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


2e-28 



4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


9e-05 




Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


(AC005990) Contains repeated


1537 



AF100694 


Mus musculus
Pontin52 mRNA, 
complete cds 


2e-28 



4056454 


region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana] 


2e-06 


1538 


AF100694 


Mus musculus
Pontin52 mRNA, 
complete cds 


2e-28


4056454 


(AC005990) Contains repeated
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


2e-09 


1539 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


2e-28 


4056454 


(AC005990) Contains repeated
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


1e-09


1540 


AF100694


Mus musculus 
Pontin52 mRNA,
complete cds 


2e-28 


4056454 


(AC005990) Contains repeated
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana]


5e-10


1541 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


2e-28 


4056454 


(AC005990) Contains repeated
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


1e-11


1542 


AF100694


Mus musculus 
Pontin52 mRNA,
complete cds 


2e-28 


3157926 


(AC002131) Strong similarity to
extensin-like protein gb|Z34465 
from Zea mays. [Arabidopsis 
thaliana] 


oe- 1_ 


1543 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


<NONE> 


<NONE> 


<NONE> 


1544 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


<NONE> 


<NONE> 


<NONE> 


1545 


AF100694


Mus musculus 
Pontin52 mRNA,
complete cds 


1e-28


<NONE> 


<NONE> 


<NONE> 






Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Mus musculus 










1546 


AF100694 


Pontin52 mRNA,
complete cds 


1e-28


<NONE> 


<NONE> 


<NONE> 


1547 


AF100694 


Mus musculus 
Pontin52 mRNA,
complete cds 


1e-28


<NONE> 


<NONE> 


<NONE> 


1548


AF100694


Mus musculus 
Pontin52 mRNA,
complete cds 


1e-28


<NONE> 


<NONE>


<NONE> 


1549 


AF100694


Mus musculus 
Pontin52 mRNA,
complete cds 


1e-28


<NONE>


<NONE> 


<NONE> 


1550 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


<NONE> 


<NONE> 


<NONE>


1551 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


<NONE> 


<NONE> 


<NONE>


1552 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


<NONE> 


<NONE> 


<NONE> 


1553 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


<NONE> 


<NONE> 


<NONE> 


1554 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


<NONE> 


<NONE> 


<NONE> 


1555 


AF100694


Mus musculus 
Pontin52 mRNA,
complete cds 


1e-28


<NONE> 


<NONE> 


<NONE> 


1556 


AF100694


Mus musculus 
Pontin52 mRNA,
complete cds 


1e-28


<NONE> 


<NONE> 


<NONE> 


1557 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


<NONE> 


<NONE> 


<NONE> 


1558 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


<NONE> 


<NONE> 


<NONE> 


1559 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


<NONE> 


<NONE> 


<NONE> 


1560 


AF100694


Mus musculus 
Pontin52 mRNA,
complete cds 


1e-28


<NONE> 


<NONE> 


<NONE>


1561 


AF100694


Mus musculus 
Pontin52 mRNA,
complete cds 


1e-28


<NONE> 


<NONE> 


<NONE>



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Mus musculus 










1562 


AF100694 


Pontin52 mRNA, 

complete cds


1e-28








1563 


AF100694


Mus musculus 
Pontin52 mRNA, 

complete cds


1e-28


<NONE> 




<NONE>


1564


AF100694


Mus musculus 
Pontin52 mRNA,
complete cds 


1e-28


<NONE> 


<NONE> 


<NONE> 


1565


AF100694


Mus musculus 
Pontin52 mRNA,
complete cds 


1e-28


<NONE>


<NONE> 


<NONE> 


1566 


M87708 


Human simple repeat 
polymorphism. 


1e-28


<NONE> 


<NONE> 


<NONE> 


1567 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


<NONE>


<NONE> 


<NONE> 


1568 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


3924779 


^MJLWJOJOJ^ M1II11JJ LU tJllllIIIII 

B; cDNA EST yk450d8.5 comes 
from this gene; cDNA EST 
yk249a6.5 comes from this 
gene; cDNA EST yk219a2.5 
comes from this gene; cDNA 
EST yk355e4.5 comes from this 
gene; cDNA EST yk224f4.5 
comes fr... 

>gi|3924881|gnl|PID|e1354569
from this gene; cDNA EST 
yk249a6.5 comes from this 
gene; cDNA EST yk219a2.5 
comes from this gene; cDNA 
EST yk355e4.5 comes from this 
gene; cDNA EST yk224f4.5 
comes from... 


3.0 


1569 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


1169643


FMRFAMIDE-RELATED
NEUROPEPTIDES
PRECURSOR >gi|416208
(U03137) neuropeptide
precursor FMRFamide-related
peptide [Lymnaea stagnalis]


0.66 







PCT/US00/18374





Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












\.*VDuUUJUJJ 3IIIUUU LU UlllimiLl 




1570 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


3924779 


B; cDNA EST yk450d8.5 comes
from this gene; cDNA EST
yk249a6.5 comes from this
gene; cDNA EST yk219a2.5
comes from this gene; cDNA
EST yk355e4.5 comes from this
gene; cDNA EST yk224f4.5
comes fr...

>gi|3924881|gnl|PID|e1354569
from this gene; cDNA EST 
yk249a6.5 comes from this 
gene; cDNA EST yk219a2.5 
comes from this gene; cDNA 
EST yk355e4.5 comes from this 
gene; cDNA EST yk224f4.5 
comes from... 


0.65 


1571 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


2133579 


spermatophorin Sp23 - yellow 
mealworm molitor]


0.49 


1572 


AF100694


Mus musculus 
Pontin52 mRNA,
complete cds 


1e-28


2133579 


spermatophorin Sp23 - yellow 
mealworm molitor]


0.49 


1573 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


283446 


cysteine-rich surface antigen 72,
CRP72 - Giardia lamblia
(fragment) 


0.45 


1574 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


2498937 


SPERMATOPHORIN SP23 
PRECURSOR mealworm 
>gi|[ 61725 (M92928) structural 
protein 


0.33 


1575 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


1492050 


(U60315) MC107L [Molluscum
contagiosum virus subtype 1] 


0.18


1576 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


2133579 


spermatophorin Sp23 - yellow 
mealworm molitor] 


0.088


1577 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


118588


DEHYDRIN DHN3 
>gi|100035|pir||S18139 dehydrin
DHN3 - garden pea >gi|20709 
(X63063) pea dehydrin DHN3 
[Pisum sativum] 


0.018


1578


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


118588


DEHYDRIN DHN3 
>gi|100035|pir||S18139 dehydrin
DHN3 - garden pea >gi|20709 
(X63063) pea dehydrin DHN3 
[Pisum sativum] 


0.016 












Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












DEHYDRIN DHN3 




1579 


AF100694 


Mus musculus
Pontin52 mRNA,
complete cds 


1e-28


118588 


>gi|100035|pir||S18139 dehydrin
DHN3 - garden pea >gi|20709 
(X63063) pea dehydrin DHN3
[Pisum sativum] 


l 

v.wU 


1580 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


4056454


(AC005990) Contains repeated
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana]


0.010 


1581 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


118588


DEHYDRIN DHN3 
>gi|100035|pir||S18139 dehydrin
DHN3 - garden pea >gi|20709 
(X63063) pea dehydrin DHN3 
[Pisum sativum]


0.002 


1582 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


1169643 


FMRFAMIDE-RELATED
NEUROPEPTIDES
PRECURSOR >gi|416208
(U03137) neuropeptide
precursor FMRFamide-related
peptide [Lymnaea stagnalis]




1583 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


0.002 


1584 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


118588 


DEHYDRIN DHN3
>gi|100035|pir||S18139 dehydrin
DHN3 - garden pea >gi|20709 
(X63063) pea dehydrin DHN3 
[Pisum sativum]


0.002 


1585 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


o.oo: 


1586 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28



118588 


DEHYDRIN DHN3 
>gi|100035|pir||S18139 dehydrin
DHN3 - garden pea >gi|20709
(X63063) pea dehydrin DHN3
[Pisum sativum]


0.001











Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












(AC005990) Contains repeated




1587 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


4056454 


region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana]


0.001 


1588 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


6e-04 


1589 


AF100694 


Mus musculus 
Pontin52 mRNA,
complete cds 


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 


je*U4 


1590 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene. 




1591 


AF100694


Mus musculus 
Pontin52 mRNA,
complete cds 


1e-28


118588 


DEHYDRIN DHN3 
>gi|100035|pir||S18139 dehydrin
DHN3 - garden pea >gi|20709
(X63063) pea dehydrin DHN3
[Pisum sativum]


2e-04 


1592 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


2e-04 


1593 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28



4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


5e-05 









Nearest Neighbor (BlastN vs. Genbank)



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



SEQ 
ID 



ACCESSION 



DESCRIPTION 



P VALUE 



ACCESSION 



DESCRIPTION 



P VALUE 



1594 



1595 



1596 



AF100694



AF100694 



Mus musculus
Pontin52 mRNA,
complete cds 



Mus musculus 
Pontin52 mRNA, 
complete cds 



AF100694



Mus musculus 
Pontin52 mRNA, 
complete cds 



1e-28



1e-28



1e-28



4056454 



(AC005990) Contains repeated



region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana]



4056454 



(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]



5e-05 



4056454 



(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 




Contains repeated 
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 

[Arabidopsis thaliana]

(AC005990) Contains repeated
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana]



1e-05



1e-05



1597 



AF100694



Mus musculus 
Pontin52 mRNA,
complete cds 



1e-28



4056454 



9e-06 



1598 



AF100694 



Mus musculus 
Pontin52 mRNA, 
complete cds 



1e-28



4056454 



1599 



AF100694 



Mus musculus 
Pontin52 mRNA,
complete cds 



1e-28



4056454 



(AC005990) Contains repeated
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 



6e-06 



5e-06 



1600 



AF100694



Mus musculus 
Pontin52 mRNA, 
complete cds 



1e-28



544357 



RNA-BINDING PROTEIN
FUS/TLS protein [human. 
Peptide. 526 aa] [Homo sapiens] 



4e-06 











Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












(AC005990) Contains repeated




1601 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


4056454 


region with similarity to 
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana]


2e-06 


1602 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to 
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788 
come from this gene. 
[Arabidopsis thaliana]


2e-06 


1603 


AF100694 


Mus musculus 
Pontin52 mRNA,
complete cds 


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to 
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana]


9e-07 


1604 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana]


8e-07 


1605 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


1169643 


FMRFAMIDE-RELATED
NEUROPEPTIDES
PRECURSOR >gi|416208
(U03137) neuropeptide
precursor FMRFamide-related 
peptide [Lymnaea stagnalis] 


7e-07 


1606 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


4056454 


(AC005990) Contains repeated 
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


6e-07


1607 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


4056454 


(AC005990) Contains repeated 
region with similarity to 
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana]


5e-07











Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












(AC005990) Contains repeated




1608 


AF100694


Mus musculus 
Pontin52 mRNA,
complete cds 


1e-28


4056454 


region with similarity to 
gb|U43627 extensin (atExtl)

gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


3e-07 


1609 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


1e-07


1610 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to 
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


1e-07


1611 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


4056454 


(AC005990) Contains repeated 
region with similarity to 
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788 
come from this gene. 
[Arabidopsis thaliana]


7e-08


1612 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


2e-08 


1613 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


6e-09 











Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












(AC005990) Contains repeated




1614 


AF100694


Mus musculus 
Pontin52 mRNA,
complete cds 


1e-28


4056454 


region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana]


5e-09 


1615 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


4e-09 


1616 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


7e-10


1617 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788 
come from this gene. 
[Arabidopsis thaliana] 


6e-10 


1618 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


4056454 


(AC005990) Contains repeated 
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


5e-10 


1619 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


4e-10











Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












(AC005990) Contains repeated




1620 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


4056454 


region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana]


2e-10 


1621 


AF100694 


Mus musculus
Pontin52 mRNA,
complete cds 


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana]


5e-11


1622 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to 
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana]


2e-12


1623 


AF032896 


Petromyzon marinus 
polyadenylate binding 
protein 


1e-28


1082703 


polyadenylate binding protein II 
human 


2e-27 


1624 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds 


9e-29 


118588 


DEHYDRIN DHN3 
>gi|100035|pir||S18139 dehydrin
DHN3 - garden pea >gi|20709 
(X63063) pea dehydrin DHN3 
[Pisum sativum]


0.013 


1625 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


9e-29 


2133579 


spermatophorin Sp23 - yellow 
mealworm molitor]


6e-04 


1626 


AF100694


Mus musculus 
Pontin52 mRNA,
complete cds


9e-29 


3876465


(Z81071) predicted using 
genefinder; Similarity to
Human small nuclear
ribonucleoprotein E cDNA EST
yk375g7.5 comes from this
gene; cDNA EST yk435t5.3
comes from this gen...


9e-06 


1627 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


8e-29 



4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


2e-06 














Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












ADP-RIBOSYLATION




1628


AF100694 


Mus musculus 
Pontin52 mRNA,
complete cds 


4e-29 


728883 


FACTOR 3 fruit fly (Drosophila
melanogaster) >gi|507234
(L25063) ADP ribosylation
factor 3 [Drosophila
melanogaster]


0.016 


1629 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


4e-29 


544357 


RNA-BINDING PROTEIN 
FUS/TLS protein [human. 
Peptide. 526 aa] [Homo sapiens]


2e-07 


1630 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


4e-29 


4056454 


(AC005990) Contains repeated
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana]


1e-08


1631 


D43682 


Human mRNA for 
very-long-chain acyl- 
CoA dehydrogenase 
(VLCAD), complete
cds 


4e-29 


1168287 


ACYL-COA

DEHYDROGENASE, VERY-
LONG-CHAIN SPECIFIC
PRECURSOR (VLCAD) 
dehydrogenase precursor - rat 
Acyl-CoA dehydrogenase 
Rattus norvegicus] 


6e-37 


1632


Y07660 


M.tuberculosis accBC 
gene 


4e-29 


2113935 


(Z95556) accD1
[Mycobacterium tuberculosis]


3e-47 


1633


X55367 


Human alpha-satellite
DNA from clone 
pTRA-2. 


1e-29


<NONE> 


<NONE> 


<NONE> 


1634 


L81866


Homo sapiens 

(subclone I f 1 from 

PI H54) DNA 
sequence 


1e-29


<NONE> 


<NONE>


<NONE> 


1635 


S75940 


(Alu repeats, clone
52H10) [human,
colonic mucosa,
Genomic, 943 nt]


1e-29


728831 


!!!! ALU SUBFAMILY J 
WARNING ENTRY 


1e-07


1636 


AB001907 


Homo sapiens 
PACE4 gene, exon 13


1e-29


728831


!!!! ALU SUBFAMILY J
WARNING ENTRY 


2e-09 


1637 


AF077003


Mus musculus SH3
domain-containing
adapter protein
mRNA, complete cds


5e-30 


<NONE> 


<NONE> 


<NONE> 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












(AC005990) Contains repeated




1638 


AF100694


Mus musculus 
Pontin52 mRNA,
complete cds 


4e-30 


4056454 


region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


3e-10 


1639 


M27072 


Xenopus laevis 
poly(A)-binding 
protein (ABP-EF)
mRNA, complete cds.


4e-30 


1352709


POLYADENYLATE- 
BINDING PROTEIN 
polyadenylate-binding protein -
African clawed frog laevis]


5e-21 


1640 


X58386 


B.taurus mRNA for 
bovine vacuolar
ATPase subunit A 


2e-30 


2773154 


(AF039573) abscisic acid- and 
stress-inducible protein


4.3 


1641 


Y07660 


M.tuberculosis accBC 
gene 


1e-30


2113935 


(Z95556) accD1
[Mycobacterium tuberculosis] 


4e-47 


1642 


AJ236940 


Sus scrofa mRNA for
hypothetical protein 
(5': clone 7C4) 


4e-31


4102021 


(AF007561) delta 6-desaturase 
[Borago officinalis]


7.4 


1643 


AF039400 


Homo sapiens 
calcium-dependent 
chloride channel-1
(hCLCA1) mRNA,
complete cds 


2e-31 


3721912 


(AB017156) gob-5 [Mus
musculus] 


7e-08 


1644 


L77036 


Homo sapiens 
(subclone 5_d9 from 
PI H19) DNA 
sequence. 


1e-31


461663 


BOMBYXIN B-2 HOMOLOG
PRECURSOR silkmoth 
>gi|217385|gnl|PID|d1003528
(D13924) Samia bombyxin 
homolog B-2 [Samia cynthia]


1.1


1645 


X61971


H.sapiens mRNA for 
macropain subunit 
delta 


1e-31


296734 


(X61971) macropain subunit 
delta [Homo sapiens] 


3e-06 


1646 


L00016 


human mitochondrial 
trnas and partial 
proteins 4 & 5; 
histidyl, seryl,
leucyl-tRNA genes;
urf4 and urf5 
(partial). 


5e-32 


4056454 


(AC005990) Contains repeated
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
(Arabidopsis thaliana] 


0.002 


1647 


M17887


Human acidic 
ribosomal 
phosphoprotein P2
mRNA, complete cds.


5e-32 


4056454 


(ACOO^yyU) Contains repeated 
region with similarity to 
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


1e-05
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SEQ 
ID 


Nearest Neighbor (BlastN vs. Genbank)


ACCESSION


DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ACCESSION


DESCRIPTION


P VALUE


1659 


i 

U53446 


Human mitogen-
responsive 

phosphoprotein DOC- 
2 mRNA, complete 
cds. 


6e-34 


3395443 


(AC004683) putative 
ammonium transporter. 3' partial 


4.7 


1660 


AF013988 


Homo sapiens serine 
protease mRNA, 
complete cds 


4e-34 


2507226


PROTEIN-TYROSINE 
PHOSPHATASE EPSILON
PRECURSOR (R-PTP-
EPSILON) >gi|1439605
(U62387) protein tyrosine 
phosphatase-e [Mus musculus] 


3.2 


1661 


U53446 


Human mitogen- 
responsive

phosphoprotein DOC- 
2 mRNA, complete 
cds. 


2e-34 


104757 


chicken >gi|212254 gallus]


1.6 


1662 


AJ233632 


Homo sapiens 
endogenous retroviral 
sequence ERV-L pol 
gene, clone ERV-L 
Human6 


2e-34 


3860513 


(AJ233597) reverse 
transcriptase [Mus famulus]


4e-10


1663 


AF086310 


Homo sapiens full 
length insert cDNA 
clone ZD51F08 


8e-35 


2947070 


( AmfPV n putative Ser/Thr
protein kinase [Arabidopsis
thaliana]


2.3 


1664 


X17206


Human mRNA for 
LLRep3 


3e-35 


730652 


RIBOSOMAL PROTEIN
S2 (STRINGS OF PEARLS
PROTEIN)

>gi|1085153|pir||S50325

ribosomal protein - fruit fly

(Drosophila melanogaster)
melanogaster] >gi|515972
(U01335) ribosomal protein S2


2e-10 


1665 


AB011137 


Homo sapiens mRNA 
for KIAA0565 
protein, complete cds 


3e-35 


3043654 


(AB011137) KIAA0565 protein
[Homo sapiens]


2e-16


1666 


U62801 


Human protease M 
mRNA, complete cds


2e-35 


3929231 


(AF091247) potassium channel 
[Rattus norvegicus]


1.0 


1667 


AF020760 


Homo sapiens serine 
protease (Omi) 
mRNA, complete cds


1e-35


2738915 


(AF020760) serine protease 
[Homo sapiens]


9e-11
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Human DNA 










1668 


Z93943 


sequence from 
cosmid U235H3 on 
chromosome X 


8e-36 


1196432 


(M22333) unknown protein 
[Homo sapiens] 


3e-10 


1669 


X06778 


Rabbit 18S rRNA


7e-36 


118588 


DEHYDRIN DHN3
>gi|100035|pir||S18139 dehydrin
DHN3 - garden pea >gi|20709
(X63063) pea dehydrin DHN3
[Pisum sativum]


0.011


1670 


AB007962 


Homo sapiens 
mRNA, chromosome 
1 specific transcript 
KIAA0493 


3e-36 


3329243 


(AE001350) hypothetical
protein [Chlamydia trachomatis] 


3.1 


1671 


Z81014


Human DNA 
sequence from 
cosmid U65A4, 
between markers 
DXS366 and DXS87 
on chromosome X


3e-36 


141103 


HYPOTHETICAL PROTEIN 
ORF-1137 mouse 


0.038 


1672 


Z81014


Human DNA 
sequence from 
cosmid U65A4, 
between markers 
DXS366 and DXS87
on chromosome X


3e-36 


198651 


(M29325) ORF1 [Mus 
musculus] 


0.006 


1673 


U49082 


Human transporter 
protein (g17) mRNA,
complete cds 


3e-36 


1840045 


(U49082) transporter protein 
[Homo sapiens] 


2e-15


1674 


J03133 


Human transcription 
factor SP1 mRNA, 3'
end. 


3e-36 


477133 


HF-l regulatory element binding 
protein - rat 


2e-31 


1675


AB007934 


Homo sapiens mRNA 
for KIAA0465 
protein, partial cds 


1e-36


3413892 


(AB007934) KIAA0465 protein
[Homo sapiens] 


4e-37 


1676 


M34857 


Mouse Hox-2.5
mRNA. 


9e-37 


106296 


homeotic protein Hox B9 -
human (fragment)


0-15 


1677 


L35657 


Homo sapiens 
(subclone HS 5_al0 
from PI 35 H5 C8) 
DNA sequence. 


9e-37 


2072960 


(U93568) p40 [Homo sapiens] 


3e-05 


1678 


X80240 


H.sapiens 
endogenous 
retrovirus HERV- 
KC4 DNA 


3e-37 


4135944 


(Y17833) env protein [Human
endogenous retrovirus K] 


1e-15
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SEQ 
ID


Nearest Neighbor (BlastN vs. Genbank)


ACCESSION


DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ACCESSION


DESCRIPTION


P VALUE


1679 


< 

Z93943


Human DNA
sequence from
cosmid U235H3 on
chromosome X


9e-38 


106322 


hypothetical protein (L1H 3'

region) - human
HYPOTHETICAL ZINC


4e-13 


1680 


X97303 


H.sapiens mRNA for
Pta-12 protein 


4e-38 


466044 


FINGER PROTEIN ZK686.4 
IN CHROMOSOME III 
>gi|630780|pir||S44909 ZK686.4 
protein - Caenorhabditis elegans
>gi|304346 (L17337) coded for
by C. elegans cDNAs
GenBank:M88869 and T01933;
putative [Caenorhabditis
elegans]


3e-37 


1681 


Y08999 


H.sapiens mRNA for
Sop2p-like protein 


3e-38 


3334339 


SOP2-LIKE PROTEIN 


5e-06 


1682 


Z62887 


H.sapiens CpG DNA,
clone 74g6, forward
read cpg74g6.ft1a.


2e-38 


1245686 


(\ 1 1 9 n F16D4 ° oene 
product [Caenorhabditis 
elegans]


0.19 


1683 


U35032 


Human endogenous
retrovirus clone 
c5.11,HERV-H 
multiply spliced 
subgenomic leader, 
protease and integrase 
region mRNA, partial 
cds 


1e-38


59977 


(Z14310) tripartite fusion 
transcript PLA2L [Human 
endogenous retrovirus] 


1e-06


1684 


D86974 


Human mRNA for 
KIAA0220 gene, 
partial cds 


1e-38


3337386 


(AC002544) Unknown gene 
product splice form-2 [Homo 
sapiens] 


8e-11


1685 


M31013 


Human nonmuscle 
myosin heavy chain 
(NMHC) mRNA. 3' 
end. 


1e-38


4115748


(AB022023) nonmuscle myosin
heavy chain B


2e-11


1686 


AF006087 


Homo sapiens Arp2/3 
protein complex 
subunit p20-Arc 
(ARC20) mRNA, 
complete cds 


4e-39 


<NONE> 


<NONE> 


<NONE> 


1687 


X58374 


D.melanogaster cm 
mRNA 


4e-39 


2655888 


(AL009171) 62D9.a 
[Drosophila melanogaster]


4e-42 


1688 


D85815 


Human DNA for 
rhoHP1, complete cds


1e-39


134080 


GTP-BINDING PROTEIN 
TC10 ras-like protein [Homo 
sapiens]


3e-26
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1689 


U49057 


Rattus norvegicus 
CTD-binding SR-like 
protein rA9 mRNA, 
complete cds 


4e-40 


1438534 


(U49057) rA9 [Rattus 
norvegicus]


5e-05 


1690 


Y08999 


H.sapiens mRNA for 
Sop2p-like protein 


4e-40 


3334339 


SOP2-LIKE PROTEIN 


9e-08 


1691 


AB002293 


Human mRNA for 
KIAA0295 gene, 
partial cds 


4e-40 


2224531 


(AB002293) KIAA0295 [Homo
sapiens] 


1e-30


1692 


AF086222 


Homo sapiens full 
length insert cDNA 
clone ZC66E08 


1e-40


2829669 


DOUBLE-STRANDED RNA- 
SPECIFIC EDITASE 1
(DSRNA ADENOSINE 
DEAMINASE) (RNA 
EDITING ENZYME 1) 
>gi|1707502|gnl|PID|e254627 
(X99227) double-stranded RNA 
specific editase [Homo sapiens] 
editase 1 hRED1-L [Homo
sapiens] >gi|2039300 (U76421) 
dsRNA adenosine deaminase 
DRADA2b [Homo sapiens] 


0.61 


1693 


AF044127 


Homo sapiens 
peroxisomal short- 
chain alcohol 
dehydrogenase 
(SCAD-SRL) mRNA, 
complete cds 


1e-40


4105190 


(AF044127) peroxisomal short- 
chain alcohol dehydrogenase 


2e-06 


1694 


U36778 


Mus musculus Sil 
mRNA, complete cds 


1e-40


88608 


SIL protein - human >gi|338088 
(M74558) SIL 


6e-23 


1695 


U36778


Mus musculus Sil 
mRNA, complete cds


1e-40


88608 


SIL protein - human >gi|338088
(M74558) SIL


6e-23 


1696 


U36778 


Mus musculus Sil 
mRNA, complete cds


1e-40


88608 


SIL protein - human >gi|338088
(M74558) SIL


5e-23 


1697 


U36778 


Mus musculus Sil 
mRNA, complete cds


1e-40


88608 


SIL protein - human >gi|338088
(M74558) SIL


5e-23 


1698 


AB018285


Homo sapiens mRNA 
for KIAA0742 
protein, partial cds 


1e-40


3882205 


(AB018285) KIAA0742 protein
[Homo sapiens]


6e-31
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION


DESCRIPTION 


P VALUE 












ATP- BINDING CASSETTE 




1699 


X75927 


M.musculus abc2 
mRNA 


1e-40


728773 


TRANSPORTER 1 ABC1 - 
human >gi|495257 (X75926) 
abc1 [Mus musculus]


3e-37


1700 


AF038200 


Homo sapiens clone 
23954 mRNA 
sequence 


5e-41


3211975 


(AF068195) putative 
glialblastoma cell differentiation- 
related protein [Homo sapiens] 


5e-14 


1701 


U20521 


Human estrogen 
sulfotransferase 
(STE) gene, exon 8
and complete cds 


4e-41


<NONE>


<NONE> 


<NONE> 


1702 


AF026548 


Homo sapiens 
branched chain alpha- 
ketoacid 

dehydrogenase kinase 
precursor, mRNA, 
nuclear gene 
encoding 
mitochondrial 
protein, complete cds 


2e-41 



3182923 


[3-METHYL-2- 
OXOBUTANOATE 
DEHYDROGENASE 
(LIPOAMIDE)] KINASE 
PRECURSOR alpha-ketoacid 
dehydrogenase kinase precursor 
[Homo sapiens] 


2e-09 


1703 


Y07660 


M.tuberculosis accBC
gene


2e-41


465847 


HYPOTHETICAL 66.5 KD 
PROTEIN F02A9.5 IN 
CHROMOSOME III

>gi|2S0542|pir||S28313 
hypothetical protein F02A9.5 - 
Caenorhabditis elegans 
Genefinder: similar to Propionyl- 
CoA carboxylase beta chain; 
cDNA EST EMBL:M89CU8 
comes from this gene; cDNA 
EST EMBL:D28069 comes 
from this gene; cDNA EST 
EMBL:D2S068 comes from this 
gene; cDNA EST ... 




1704 


AG001237 


Homo sapiens 
genomic DNA, 21q 
region, clone: 
9H11N46 


1e-41


106322 


hypothetical protein (L1H 3' 
region) - human 


5e-09 


1705 


AB007934 


Homo sapiens mRNA 
for KIAA0465 
protein, partial cds 


1e-41


3413892 


(AB007934) KIAA0465 protein
[Homo sapiens] 


3e-12 


1706 


AF055029 


Homo sapiens clone 
24711 mRNA 
sequence 


5e-42 


3250681 


(AL024486) putative protein


2.2
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












1-




1707 


Z49747 


O.cuniculus mRNA 
for phospholipase C 


5e-42 


130227 


PHOSPHATIDYLINOSITOL-
4,5-BISPHOSPHATE
PHOSPHODIESTERASE
DELTA 1 (PLC-DELTA-1)
(PHOSPHOLIPASE C-DELTA-
1) (PLC-III) >gi|163538
(M20638) phospholipase C-III
[Bos taurus]


5e-36 


1708 


M93651 


Human set gene, 
complete cds. 


2e-42 


<NONE> 


<NONE> 


<NONE> 


1709 


AJ236940 


Sus scrofa mRNA for 
hypothetical protein
(5': clone 7C4)


2e-42 


2062403 


(U79010) delta 6 desaturase 
[Borago officinalis]


8.5 


1710 


J03634 


Human erythroid 
differentiation protein 
mRNA 


2e-42 


1708436 


INHIBIN BETA A CHAIN 
PRECURSOR 


2e-10 


1711 


AJ223777 


Mus musculus mRNA 
for striatin 


6e-43 


2494917 


STRIATIN

>gi|1495773|gnl|PID|e254158


2e-32 


1712 


AF016411


Homo sapiens 
potassium channel 
subunit KCNA3.1B 


2e-43 


2708514 


(AF016411) KCNA3.1B [Homo 
sapiens] 


3e-13


1713 


AC001443 


Homo sapiens 
(subclone 2 J: 10 from 
BAC 2913 


1e-43


111814 


hypothetical protein 3 - rat
>gi|56589


2e-06 


1714 


X82895 


H. sapiens mRNA for 
DLG2 


6e-44 


2497511 


MAGUK P55 SUBFAMILY 
MEMBER 2 (MPP2 PROTEIN) 
(DISCS LARGE HOMOLOG
2) 


6e-52 


1715 


U17077 


Human BENE 
mRNA, partial cds.


3e-44 


53912 


(X57960) ribosomal protein L7 
[Mus musculus] >gi|55489


8e-30 


1716 


AJ222700 


Homo sapiens mRNA 
for TSC-22 protein


2e-44 


<NONE> 


<NONE> 


<NONE> 


1717 


J03634 


Human erythroid 
differentiation protein 
mRNA 


2e-44 


124279 


INHIBIN BETA A CHAIN
PRECURSOR PROTEIN)
(EDF) >gi|87936|pir||B24248
inhibin beta-A chain precursor -
human >gi|181947 (J03634)
erythroid differentiation protein
precursor [Homo sapiens]
sapiens]

>gi|22bS50|prt1|160S26OB 
inhibin betaA [Homo sapiens] 


0.73 
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1718 


AB014518


Homo sapiens mRNA 
for KIAA0618 
protein, complete cds 


7e-45 


1911548 


(S80864) cytochrome c-like 
polypeptide sapiens] 


1.6 


1719 


X76808 


H.sapiens genomic 
DNA clone d2 


7e-45 


868201 


(U29380) similar to adenylate 
cyclase [Caenorhabditis elegans] 


2e-09 


1720 


AB021288 


Homo sapiens mRNA 
for beta 2- 
microglobulin, 
complete cds 


2e-45 


2465521 


(U95995) RNA-dependent RNA 
polymerase [Cryptosporidium 
parvum] 


0.15 


1721 


X63468 


H.sapiens mRNA for 
transcription factor 
TFIIE alpha 


8e-46


<NONE> 


<NONE> 


<NONE> 


1722 


AF019226 


Homo sapiens D2-2 
mRNA, 3'UTR 


7e-46 


<NONE> 


<NONE> 


<NONE> 


1723 


D31764 


Human mRNA for 
KIAA0064 gene, 
complete cds 


2e-46 


3123050 


HYPOTHETICAL PROTEIN 
KIAA0064 


1e-15


1724 


K02774 


Human MHC class II
HLA-DR-beta-psi 
(DW4/DR4) 
pseudogene, exons 
3, 4, 5, 6, clones cosII-
3301 and cosII-801.


1e-46


4185946 


(Y17834) gag protein [Human
endogenous retrovirus K]


2e-14


1725 


X92109 


H.sapiens hcsIX gene 


9e-47 


2498185 


BRIDE OF SEVENLESS
PROTEIN PRECURSOR
>gi|1079166|pir||A47550 bride
of sevenless precursor - fruit fly
(Drosophila virilis) >gi|290216
virilis]


1.4 


1726 


X93334 


H.sapiens 

mitochondrial DNA, 
complete genome 


8e-47 


128753 


NADH-UBIQUINONE
OXIDOREDUCTASE CHAIN
4 >gi|86696|pir||A00435 NADH
dehydrogenase (ubiquinone) 


4e-15


1727 


M35145 


Human tumor 
necrosis factor 
receptor, 3' flank.


3e-47


<NONE>


<NONE> 


<NONE> 


1728 


X80240


H.sapiens 
endogenous 
retrovirus HERV- 
KC4 DNA 


3e-47 


4185944 


(Y17833) env protein [Human
endogenous retrovirus K]


7e-lS 


1729 


Z63594 


H.sapiens CpG DNA. 
clone 87t9. forward 
readcpeS7t'9.t*tla . 


1e-47


3322743 


(AE001222) T. pallidum
predicted coding region TP0454


2.4 
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






R.rattus mRNA for 










1730 


X62295 


vascular type-1
angiotensin II
receptor 


4e-48 


1209756 


(U43629) integral membrane 
protein [Beta vulgaris] 


1e-07


1731 


M85145 


Human tumor 
necrosis factor 
receptor, 3' flank.


3e-48 


<NONE> 


<NONE> 


<NONE> 


1732 


AB020712 


Homo sapiens mRNA 
for KIAA0905 
protein, complete cds 


4e-49 


4240299 


(AB020712) KIAA0905 protein 
[Homo sapiens]


2e-20 


1733 


AB020712


Homo sapiens mRNA 
for KIAA0905 
protein, complete cds 


3e-49 


4240299 


(AB020712) KIAA0905 protein
[Homo sapiens]


2e-20


1734 


X62295 


R.rattus mRNA for 
vascular type-1
angiotensin II 
receptor 


1e-49


1209756 


(U43629) integral membrane 
protein [Beta vulgaris]


7e-12


1735 


AJ007509 


Homo sapiens mRNA 
for E1B-55kDa-
associated protein 


1e-49


3319956 


(AJ007509) E1B-55kDa-
associated protein 


4e-24 


1736 


X97303 


H. sapiens mRNA for 
Pt2-12 protein 


1e-49


466044 


HYPOTHETICAL ZINC

FINGER PROTEIN ZK686.4
IN CHROMOSOME III
>gi|630780|pir||S44909 ZK686.4
protein - Caenorhabditis elegans
>gi|304346 (L17337) coded for
by C. elegans cDNAs
GenBank:M88869 and T01933;
putative [Caenorhabditis
elegans]


j 

8e-3* ' 


1737 


AF038404


Homo sapiens 
homolog of Nedd5
(hNedd5) mRNA,
complete cds 


4e-50


<NONE>


<NONE>


<NONE>


1738 


L43618 


Homo sapiens 
polycystic kidney 
disease (PKD1) gene,
exons 35-42 


4e-50 


903758


(L43619) polycystic kidney 
disease 1 protein [Homo 
sapiens] 


3e-' ■ 


1739 


AF009424 


Homo sapiens clone 
22 mRNA, alternative 
splice variant alpha-1,
complete cds 


4e-50 


2271473 


(AF009426) clone 22 [Homo 
sapiens] 


! 

5c 



WO 01/02568
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












monosaccharide transport protein




1740 


L77040 


Homo sapiens 
(subclone 8_c11 from
P1 H22) DNA
sequence. 


2e-50 


99758 


STP4 - Arabidopsis thaliana
>gi|16524 (X66857) sugar
transport protein [Arabidopsis
thaliana]


6.4 


1741


L35657 


Homo sapiens 


(subclone H8 5_alO 
from PI 35H5C8) 
DNA sequence. 


2e-50 


2072960 


(U93568) p40 [Homo sapiens] 


2e-05 


1742 


U80745 


Homo sapiens CTG7a 
mRNA, partial cds 


1e-50


<NONE> 


<NONE> 


<NONE> 


1743 


D84514 


Bovine mRNA for 
p97. partial cds 


ieou 




(AF103728) structural

polyprotein [Sindbis virus]


9.9 


1744


M22960 


Human protective 
protein mRNA,
complete cds.


1e-50


131081 


LYSOSOMAL PROTECTIVE
PROTEIN PRECURSOR
(CATHEPSIN A)
(CARBOXYPEPTIDASE C) 
nutnuti — 1 1 i 7u»j» 
protective protein precursor 


1e-12




X86018


H.sapiens mRNA for 
MUF1 protein


1e-50


1082610 


muf1 protein - human
^cil7fPQ*Sj (X^OIS} mufl 
[Homo sapiens] 


1e-21


1746 


U03495 


Human transcription 
factor LSF-ID 
mRNA, complete cds. 


7e-51


2136296 


transcription factor LSF - human
>gi|476099


1e-21


1747


AB015344


Homo sapiens 
HRIHFB2157
mRNA, partial cds


5e-51


3970874 


(AB015344) HRIHFB2157
[Homo sapiens] 


2e-35 


1748


M93339 


Human zinc finger 
protein mRNA. 


4e-51


3024110


MYC-ASSOCIATED ZINC 
FINGER PROTEIN sapiens]


2e-06 


1749 


U71363 


Human zinc finger 
protein zfp6 (ZF6)
mRNA, partial cds


4e-51 


2689441 


(AC003682) F18547_1 [Homo
sapiens] 


2c- U 


1750 


X56932 


H.sapiens mRNA for 
23 kD highly basic 
protein 


4e-51


730451 


60S RIBOSOMAL PROTEIN
L13A (23 KD HIGHLY BASIC
PROTEIN)

>gi|345897|pir||S29539 basic
protein, 23K - human >gi|23691
(X56932) 23 kD highly basic
protein [Homo sapiens] 




1751 


Z79054 


H.sapiens flow-sorted
chromosome 6
HindIII fragment,
SC6pA21E11


2e-51


<NONE> 


<NONE> 


<NONE> 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Homo sapiens 










1752 


AF068245 


BAF60b gene, partial 
sequence 


5e-52


<NONE>


<NONE>




1753 


AJ236932 


Sus scrofa mRNA for 
hypothetical protein 
(5'; clone 4B8)






RIBONUCLEOPROTEIN 
RB97D ribonucleoprotein 
[Drosophila melanogaster]


4.7 


1754 


AF003693 


Mus musculus 
scaffold protein Pbp1
homolog mRNA, 
complete cds 


6e-53 


2197106 


(AF003693) scaffold protein
Pbp1 homolog [Mus musculus]


2e-54 


1755 


M27319 


Human calmodulin 
mRNA, complete cds. 


5e-53


115528 


CALMODULIN
>gi|102408|pir||JC1309
calmodulin - Stylonychia lemnae
(SGC5) >gi|161195


0.002 


1756 


M74555 


Mouse house-keeping 
protein mRNA, 
complete cds. 


5eoJ 




house-keeping protein - mouse 




1757 


X92720 


H.sapiens mRNA for 
phosphoenolpyruvate 
carboxykinase


6e-54 


2135915 


phosphoenolpyruvate 
carboxykinase (GTP) (EC

H.l.l.Ja^ UIV^Ul Jul , 

mitochondrial - human 
carboxykinase (GTP) [Homo
sapiens] 


6e-21 


1758 


AF007872 


Homo sapiens torsinB 
(DQ1) mRNA, partial 
cds 


2e-54


i. 1 OU I J. 1 


(AB002405) LAK-4p [Homo 
sapiens]


0.27 


1759 


U49507 


Mus musculus 
B6CBA Lisch7 
mRNA, partial cds. 


2e-54 


1236083 


(U49507) Lisch7 [Mus 
musculus] 


3e-27 


1760 


Z73360 


Human DNA

sequence from
cosmid 92M18,
BRCA2 gene region

chromosome 13q12-
13.


1e-55


2370371 


(Y14657) hydrophobin
[Pleurotus ostreatus]
>gi|2982620|gnl|PID|e1283986
(AJ225061) POH2 hydrophobin
[Pleurotus ostreatus]


2.0 


1761 


U83702 


Human cytochrome c 
oxidase subunit VIa
gene, exon 3 and 
complete cds 


8e-56 


2982994 


(AE000682) hypothetical
protein [Aquifex aeolicus]


7.0 


1762 


Y12781


Homo sapiens mRNA 
for transducin (beta)
like 1 protein 


7e-56 


3021409 


(Y12781) transducin (beta) like
1 protein [Homo sapiens]


7e-39
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 

ID


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1763 


AB020673 


Homo sapiens mRNA 
for KIAA0866 
protein, complete cds 


8e-57 


2104553 


(AF001548) Myosin heavy 
chain (MHY11) (5' partial)
[Homo sapiens]


4e-04 


1764 


AJ236932 


Sus scrofa mRNA for 
hypothetical protein 
(5': clone 4B8) 


3e-57 


400927 


RIBONUCLEOPROTEIN
RB97D ribonucleoprotein
[Drosophila melanogaster]


4.7 


1765 


L06900 


Human dystrophin 
gene, intron I 
containing pseudo 
exon. 


1e-58


4185129


(AC005724) unknown protein 
[Arabidopsis thaliana]


7.0 


1766 


X93334 


H.sapiens 

mitochondrial DNA, 
complete genome


9e-59 


1492050 


(U60315) MC107L [Molluscum 
contagiosum virus subtype 1]


0.17 


1767 


AF064856 


Rattus sp. 7acomp 
protein mRNA, 
complete cds 


3e-59 


3169626 


(AF064856) 7acomp protein 
[Rattus sp.]


2e-31


1768 


AF081484 


Homo sapiens alpha-
tubulin isoform 1
mRNA, complete cds 


2e-59 


32015 


(X06956) alpha-tubulin [Homo 
sapiens] 


4e-22 


1769 


X71427 


Homo sapiens mRNA 
for FUS-CHOP 
protein fusion 


1e-60


746557 


(U23523) histidine-rich 
[Caenorhabditis elegans]


0.45 


1770 


AF013988


Homo sapiens serine 
protease mRNA, 
complete cds 


1e-60


2564316 


(AB006622) No similarities to 
any reported proteins [Homo 
sapiens]


0.26 


1771 


U25691 


Mus musculus 
lymphocyte specific 
helicase mRNA, 
complete cds 


7e-61


2137490 


lymphocyte specific helicase - 
mouse musculus] 


3e-25 


1772


X93334 


H.sapiens 

mitochondrial DNA, 
complete genome


4e-61 


70656 


ubiquitin / ribosomal protein 
S27a - human extension protein, 
HUBCEP80 [human, Peptide,
156 aa] ubiquitin extension
protein [Cavia porcellus]


9e-08


1773 


D38255


Homo sapiens mRNA 
for CAB1, complete
cds 


4e-61


2135214 


gene MLN 64 protein - human


4e-23 


1774 


U25691


Mus musculus 
lymphocyte specific 
helicase mRNA,
complete cds 


8e-62 


2137490 


lymphocyte specific helicase - 
mouse musculus] 


Se-26 
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1775 


M21731 


Human lipocortin-V 
mRNA, complete cds.


6e-62 


3212603 


Human Annexin V With Proline 
Substitution By Thioproline 


ze-_u 


1776 


AF021936 


Rattus norvegicus 
myotonic dystrophy 
kinase-related Cdc42- 
binding kinase 
MRCK-beta (MRCK- 
beta) mRNA, 
complete cds 


2e-62 


2736153 


(AF021936) myotonic 
dystrophy kinase-related Cdc42-
binding kinase MRCK-beta
[Rattus norvegicus]


3e-27 


1777 


Y12059


H.sapiens HUNK1
mRNA 


1e-62


3184498 


(AC004798) R31546.1 [Homo
sapiens] 


3e-09 


1778 


L37368 


Human (clone E5.1) 
RNA-binding protein 
mRNA, complete cds.


6e-63 


477578 


sialidase - Actinomyces viscosus
>gi|141852


7 <l 


1779 


M27877 


Figure 2. Nucleotide 
and translated protein 
sequences of HPF1, - 
2, and -9. 


5e-63 


1731443 


ZINC FINGER PROTEIN 83 
(ZINC FINGER PROTEIN 
HPF1) >gi|106023|pir||A32891
finger protein 1, placental -
human 


3e-33 


1780 


AF095448 


Homo sapiens 
putative G protein- 
coupled receptor 


2e-63 


3116131 


(AL02328S) hypothetical 
protein 


4.6 


1781 


L19437


Human transaldolase 
mRNA containing 
transposable element, 
complete cds 


2e-63 


15:531 19 


(U63159) transaldolase [Mus 
musculus] 


4e-18 


1782 


L41351 


Homo sapiens 
prostasin mRNA, 
complete cds 


1e-63


2833277 


PROSTASIN PRECURSOR 
precursor - human >gi|862305 
(L41351) prostasin [Homo
sapiens] >gi|1143194 (U33446)
prostasin [Homo sapiens] 


6e-14 


1783 


AF053470 


Homo sapiens 10kD
protein (BC10)
mRNA, complete cds


6e-64 


482237 


hypothetical protein K03H1.9 - 
Caenorhabditis elegans 


0.029 



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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1784 


D37791 


Mouse mRNA for 
beta-1,4-

galactosyltransferase


6e-64 


3880102 


similar to FYVE zinc

finger; cDNA EST yk265b4.5
comes from this gene; cDNA
EST yk359g9.5 comes from this
gene; cDNA EST yk319c2.5
comes from this gene 
[Caenorhabditis elegans] zinc 
finger; cDNA EST yk265b4.5 
comes from this gene; cDNA 
EST yk359g9.5 comes from this 
gene; cDNA EST yk319c2.5 
comes from this gene 
[Caenorhabditis elegans] 


3e-16 


1785 


AF015770 


Mus musculus radical 
fringe (radical-fringe) 
mRNA, complete cds 


6e-64 


2204355 


(U94350) radical fringe 
precursor [Mus musculus] 


1e-36


1786 


Z79054 


H.sapiens flow-sorted 
chromosome 6 
HindIII fragment,
SC6pA21E11


2e-64 


<NONE> 


<NONE> 


<NONE> 


1787 


M83094


Homo sapiens 
cytosolic selenium- 
dependent glutathione 
peroxidase gene, 
complete cds, and 
rhoh12 gene, 3' end.


1e-64


2447063 


(U42580) A565R [Paramecium 
bursaria Chlorella virus 1]


8.3 


1788 


Y10211 


H.sapiens LAG-3 
gene, promoter region 


7e-65 


1944540 


(X14112) tegument protein
[human herpesvirus 1] 


2.3 


1789 


M19045


Human lysozyme 
mRNA, complete cds. 


2e-65 


<NONE> 


<NONE> 


<NONE> 


1790 


U01882


Homo sapiens SS- 
A/Ro autoantigen 52 
kda component gene, 
complete cds 


2e-65 


585401 


LIPASE MODULATOR

PRECURSOR (LIPASE
HELPER PROTEIN)
>gi|480045|pir||S36249 lipB
protein - Pseudomonas glumae 
>gi|49207 (X70354) helper 
protein 


4.2 


1791 


AF069517 


Homo sapiens RNA 
binding protein DEF- 
3 mRNA, complete
cds 


2e-65 


3212101 


(AF069517) RNA binding 
protein DEF-3 [Homo sapiens]


1e-25
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Homo sapiens jerky 










1792 


AF004715 


gene product 
homolog mRNA,
complete cds 


2e-65 


2314829 


(AF004715) jerky gene product
homolog [Homo sapiens]


2e-45 


1793 


X59652 


C.longicaudatus hprt
mRNA for
hypoxanthine


3e-66 


631625 


hypoxanthine (guanine)
phosphoribosyltransferase - long
tailed hamster
phosphoribosyltransferase
[Cricetulus longicaudatus]


6e-54 


1794 


U94350 


Mus musculus radical 
fringe precursor 
mRNA, complete cds


3e-67 


2204355 


(U94350) radical fringe 
precursor [Mus musculus] 


2e-33 


1795 


AF015811 


Mus musculus 
putative 

lysophosphatidic acid 
acyltransferase 
mRNA, complete cds


3e-68 


2317725 


(AF015811) putative 
lysophosphatidic acid 
acyltransferase [Mus musculus]


7e-51


1796 


J03137 


Cow 

phosphoinositide- 
specific 

phospholipase C 


3e-69 


226908 


phospholipase C 154 [Bos 
taurus]


3e-25 


1797 


AF044574 


Rattus norvegicus 
putative peroxisomal 
2,4-dienoyl-CoA 
reductase (DCR- 
AKL) mRNA, 
complete cds 


1e-69


4105269 


(AF044574) putative 
peroxisomal 2,4-dienoyl-CoA 
reductase [Rattus norvegicus]


2e-33 


1798 


AF015811 


Mus musculus 
putative 

lysophosphatidic acid
acyltransferase
mRNA, complete cds


4e-70 


2317725 


(AF015811) putative 
lysophosphatidic acid
acyltransferase [Mus musculus]


3e-19


1799 


X65157


M.musculus mRNA
for desmoyokin,
partial


5e-74 


109781


desmoyokin - mouse (fragment)
>gi|50675


9e-37 


1800 


Z97207


Mus musculus mRNA
for B-IND1 protein


2e-74 


2231019


(Z97207) B-IND1 protein [Mus

musculus]


6e-21 


1801 


U27196


Gallus gallus zinc
finger protein (Fzf-1)
mRNA, complete cds.


6e-75


984814


(U27196) zinc finger protein
[Gallus gallus]


?e-44 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












70 KD WD-REPEAT TUMOR-




1802 


Y15054


Rattus norvegicus 
mRNA for 70 kDa 
tumor specific 
antigen, partial 


3e-77 


3123027 


SPECIFIC ANTIGEN
>gi|2505957|gnl|PID|e353992
(Y15054) 70 kD tumor-specific
antigen [Rattus norvegicus] 


4e-42 


1803 


X65157 


M.musculus mRNA 
for desmoyokin,
partial 


3e-79 


109781 


desmoyokin - mouse (fragment) 
>gi|50675 


9e-33 


1804 


U50736 


Rattus norvegicus 
cardiac adriamycin 
responsive protein 
mRNA, complete cds


2e-84 


1362781 


cytokine inducible nuclear 
protein C193 - human
>gi|793841 (X83703) nuclear
protein [Homo sapiens] 


7e-30 


1805 


AF072865 


Rattus norvegicus 
thioredoxin reductase 
(TrxR2) mRNA,
nuclear gene 
encoding 
mitochondrial 
protein, complete cds 


2e-84 


3757888 


(AF072865) thioredoxin 
reductase [Rattus norvegicus] 


6e-43 


1806


AF044574 


Rattus norvegicus 
putative peroxisomal 
2,4-dienoyl-CoA
reductase (DCR-
AKL) mRNA,
complete cds 


6e-85 


4105269 


(AF044574) putative 
peroxisomal 2,4-dienoyl-CoA 
reductase [Rattus norvegicus] 


1e-41


1807 


U19181 


Rattus norvegicus 
Rabin3 mRNA, 
complete cds. 


2e-87


624225 


(U19181) Rabin3 [Rattus
norvegicus] 


2e-41 


1808 


U40342 


Mus musculus ninein 
mRNA, complete cds.


1e-91


1113865 


(U40342) ninein [Mus 
musculus] 


2e-36 


1809 


X67877 


R.norvegicus mRNA 
for cytosolic 
resiniferatoxin- 
binding protein 


4e-92 


136077 


TROPOMYOSIN BETA 3, 
FIBROBLAST chicken
>gi|515694 (M23082)
tropomyosin [Gallus gallus] 


0.56 


1810 


AF044574 


Rattus norvegicus 
putative peroxisomal 
2,4-dienoyl-CoA 
reductase (DCR- 
AKL) mRNA,
complete cds 


5e-93 


4105269 


(AF044574) putative 
peroxisomal 2,4-dienoyl-CoA 
reductase [Rattus norvegicus] 


1e-50


1811 


AF035527 


Mus musculus EHF 
(Ehf) mRNA,
complete cds 


2e-95 


3138930 


(AF035527) EHF [Mus 
musculus]


2e-47 



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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1812 


AB016930


Cricetulus griseus
mRNA for

Phosphatidylglycerophosphate
synthase,
complete cds


6e-96


4159682 


(AB016930)

Phosphatidylglycerophosphate
synthase [Cricetulus griseus]


7e-41


1813 


AB005549 


Rattus norvegicus 
mRNA for atypical 
PKC specific binding 
protein, complete cds 


7e-97 


3868778 


(AB005549) atypical PKC 
specific binding protein [Rattus 
norvegicus]


3e-41 


1814 


X90849 


G.gallus PB1 gene


2e-97 


2134381 


polybromo 1 protein - chicken 
chicken >gi|951231 (X90849)
polybromo 1 protein [Gallus
gallus]


1e-34


1815 


S79873 


h-lamp-2=lysosome-
associated membrane
protein-2 protein-2b
(LAMP?) mRNA,
alternatively spliced
form h-lamp-2b,
complete cds. 


3e-98 


<NONE> 


<NONE> 


<NONE> 


1816 


U67203 


Mus musculus ACF7 
neural isoform 1
(mACF7) mRNA, 
partial cds 


2e-98 


1675224 


(U67204) ACF7 neural isoform 
2 [Mus musculus] 


9e-39 


1817 


L14684


Rattus norvegicus 
nuclear-encoded 
mitochondrial 
elongation factor G 
mRNA, complete cds. 


e-100 


585084 


ELONGATION FACTOR G, 
MITOCHONDRIAL 
PRECURSOR (MEF-G) 
>gi|543383|pir||S40780
translation elongation factor G,
mitochondrial - rat >gi|310102


2e-30 


1818 


X84692 


M. musculus Spnr 
mRNA for RNA 
binding protein 


e-133


1363238 


spermatid perinuclear RNA- 
binding protein Spnr - mouse 
>gi|673454 (X84692) spermatid 
perinuclear RNA binding 
protein [Mus musculus] 


5e-35 


1819 


U50736 


Rattus norvegicus 
cardiac adriamycin 
responsive protein 
mRNA. complete cds 


e-113


1362781 


cytokine inducible nuclear 
protein C193 - human
>gi|793841 (X83703) nuclear
protein [Homo sapiens] 


2e-36 


1820 


S66855 


HoxB9=Hox-2.5
[mice, embryos,
mRNA Partial, 786
nt]


e-107 


1708355 


HOMEOBOX PROTEIN HOX- 
B9 (HOX-2.5) 


8e-37







Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE



1821


S66855


HoxB9=Hox-2.5
[mice, embryos,
mRNA Partial, 786
nt]



e-108 



1708355 



HOMEOBOX PROTEIN HOX-
B9 (HOX-2.5)



4e-37



1822


U92072



Rattus norvegicus m-
tomosyn mRNA,
complete cds 



e-102 



3790389 



(U92072) m-tomosyn [Rattus 
norvegicus] 



2e-38 



1823


D17577



Mouse mRNA for 
kinesin-like protein 
(Kif1b), complete cds



e-129 



2497524 



KINESIN-LIKE PROTEIN
KIF1B mouse
>gi|407339|gnl|PID|d1005029
(D17577) Kif1b [Mus
musculus]



2e-39 



1824


AF062484



Mus musculus SDP8
mRNA, complete cds 



e-122 



3126981 



(AF062484) SDP8 [Mus
musculus]



5e-40 



1825


X73683



R.norvegicus mRNA 
for histone H3.3 



e-109 



122075 



(H3.3Q) histone H3.3 - fruit fly 
(Drosophila melanogaster) 

histone H3.3B - chicken
>gi|2ll9023|pir||S612iS histone 
H3.3 - fruit fly (Drosophila 
hydei) 1-136 1 ) [Oryctolagus 
cuniculusj >gijS046 (X53S22) 
Histone H3.3Q gene product 
[Drosophila melanogaster] 
>gi|5L 198 gallus] >gi|l61190 
(MI 7876) histone H3 [Spisula 
solidissima] >gi|2U853 
(Mi 1393) histone 3.3 [Gallus 
gallus] >gi|306S4S (Ml 1354) 
H3.3 histone [Homo sapiens] 
melanogaster] >gi|96303 I 
(XS 1205) histone H3.3H3.3A 
variant [Drosophila 
melanogaster] musculus]



2e-40 



1826


U67203



Mus musculus ACF7
neural isoform 1
(mACF7) mRNA, 
partial cds 



e-102 



1675224 



(U67204) ACF7 neural isoform 
2 [Mus musculus]



1827


D17577



Mouse mRNA for 
kinesin-like protein 
(Kif1b), complete cds



e-131



2497524 



KINESIN-LIKE PROTEIN
KIF1B mouse
>gi|407339|gnl|PID|d1005029
(D17577) Kif1b [Mus
musculus]



2e-40 



7e-42 








Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1828 


AB016930 


Cricetulus griseus 
mRNA for 

Phosphatidylglycerop 
hosphate synthase, 
complete cds 


e-131


4159682 


(AB016930) 

Phosphatidylglycerophosphate
synthase [Cricetulus griseus]


3e-43 


1829 


U09874 


Mus musculus SKD3 
mRNA, complete cds 


e-122 


2493735 


SKD3 PROTEIN SKD3 [Mus 
musculus] 


7e-48 


1830 


X99145 


C.familiaris mRNA 
for C3VS protein


e-110 


1429314 


(X99145) overexpressed in 
thyroid tissue after TSH 
stimulation [Canis familiaris]


2e-49 


1831 


X99836 


P.waltl mRNA for
rnp associated protein 
55 


e-106 


4200286 


(X99836) rap55 [Pleurodeles 
waltl] 


2e-50 


1832 


AF077003 


Mus musculus SH3 
domain-containing 
adapter protein 
mRNA. complete cds 


e-121 


3550240 


(AF077003) SH3 domain- 
containing adapter protein; 
CD2AP 


3e-51


1833 


AF060246 


Mus musculus strain 
C57BL/6 zinc finger 
protein 106 (Zfp106)
mRNA, H3a-a allele, 
complete cds 


e-118


3372657 


(AF060246) zinc finger protein 
106 [Mus musculus]


1e-52


1834 


Z14030 


R.norvegicus mRNA 
for TRAP-complex
gamma subunit. 


e-120 


1174453 


TRANSLOCON-
ASSOCIATED PROTEIN,
GAMMA SUBUNIT (TRAP- 
GAMMA) (SIGNAL 
SEQUENCE RECEPTOR 
GAMMA SUBUNIT) (SSR- 
GAMMA) 

>gi|423185|pir||S33294 
translocon-associated protein 
gamma chain - rat norvegicus]


7e-54 


1835 


AF077003


Mus musculus SH3
domain-containing
adapter protein
mRNA, complete cds


e-132 


3550240


(AF077003) SH3 domain-
containing adapter protein;
CD2AP


5e-54 


1836 


L20427


Rattus norvegicus
dihydroxypolyprenylbenzoate
methyltransferase
mRNA, complete cds


e-116


457372


(L20427)
dihydroxypolyprenylbenzoate
methyltransferase
dihydroxypolyprenylbenzoate
methyltransferase [Rattus
norvegicus]


4e-56



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION 


DESCRIPTION 


P VALUE 










1837


X80169


M.musculus mRNA
for 200 kD protein


e-122


1717793


PROTEIN TSG24 (MEIOTIC
CHECK POINT
REGULATOR)
>gi|1083553|pir||A55117 tsg24


2e-56 


1838 




Rattus norvegicus
CTP:phosphoethanolamine
cytidylyltransferase
mRNA, complete cds


e-119


3396102 


(AF080568) 

CTP:phosphoethanolamine 
cytidylyltransferase 


6e-58


1839 


X99145 


C.familiaris mRNA
for C3VS protein 


e-121


1429314 


(X99145) overexpressed in 
thyroid tissue after TSH 
stimulation [Canis familiaris] 


2e-58


1840 


AF019075


Pan troglodytes breast 
and ovarian cancer 
susceptibility 
(BRCA1) gene,
partial cds 


e-145


2218154 


(AF005068) breast and ovarian
cancer susceptibility protein 
splice variant [Homo sapiens] 


1e-58


1841 


U55042 


Bos taurus myosin X, 
complete cds 


e-122 


1755049


(U55042) myosin X [Bos 
taurus] 


1e-61


1842 


AJ007780 


Mus musculus mRNA 
for poly(ADP-ribose)
polymerase-2


e-119


3283975 


(AF072521) poly-(ADPribosyl)- 
transferase homolog PARP


4e-62 


1843




Rattus norvegicus 
thioredoxin reductase 
(TrxR2) mRNA, 
nuclear gene 
encoding 
mitochondrial 
protein, complete cds 


e-105


3757888 


(AF072865) thioredoxin
reductase [Rattus norvegicus]


3e-62 


1844 


U55042 


Bos taurus myosin X, 
complete cds 


e-121


1755049 


(U55042) myosin X [Bos 
taurus] 


1e-62


1845 


X61506 


Mouse E46 mRNA 
for E46 protein 


e-139


114909 


BRAIN PROTEIN E46 


9e-67 


1846 


D90335


Bovine mRNA for
GTP-binding protein
alpha-subunit


e-148 


585174


GUANINE NUCLEOTIDE-
BINDING PROTEIN, ALPHA-
14 SUBUNIT (GL1)
>gi|108711|pir||A40891 GTP-
binding protein GL1 alpha chain
bovine protein, alpha-subunit
[Bos taurus]


2e-69 


1847 


U49507


Mus musculus
B6CBA Lisch7
mRNA, partial cds.


e-140


2121326


(AC002128) Lisch7 [Homo
sapiens]


2e-74 
Table 4





Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 


1 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


2 


<NONE>


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


3 


<NONE>


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


4 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


5 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


6 


<NONE>


<NONE>


<NONE>


<NONE> 


<NONE> 


<NONE> 


7 


<NONE> 


<NONE> 


<NONE> 


<NONE>


<NONE> 


<NONE>


8


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


9 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


10


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


11 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


12 


<NONE> 


<NONE>


<NONE> 


<NONE> 


<NONE> 


<NONE> 


13 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


14 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


15 


<NONE>


<NONE> 


<NONE>


<NONE> 


<NONE> 


<NONE>


16 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


17 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


18 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


19 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


20 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


21 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


22 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


23 


<NONE>


<NONE> 


<NONE> 


1079469 


dMDC I protein - crab-eating 
macaque 


9.3 


24 


<NONE> 


<NONE> 


<NONE> 


3043656 


(AB011138) KIAA0566 protein
[Homo sapiens]


9.3 


25 


<NONE> 


<NONE> 


<NONE> 


112175 


potassium channel protein RK5 - 
rat protein [Rattus norvegicus]


8.6 


26 


<NONE> 


<NONE> 


<NONE> 


3769624 


(AF091565) olfactory receptor 
[Rattus norvegicus]


7.2 


27 


<NONE> 


<NONE> 


<NONE> 


3876443 


(Z81517) F28B1.6
[Caenorhabditis elegans]


7.1


28 


<NONE> 


<NONE> 


<NONE> 


2224464 


(AB001684) ORF249 [Chlorella
vulgaris] 


6.9 


29 


<NONE> 


<NONE> 


<NONE> 


1519707 


(U67940) ORFveglOfc; random 
cDNA sequence [Dictyostelium 
discoideum]


6.7 


30 


<NONE> 


<NONE> 


<NONE> 


227491 


protein kinase C II [Xenopus 
laevis] 


6.7 


31 


<NONE> 


<NONE> 


<NONE> 


630575 


C50C3.4 protein - 
Caenorhabditis elegans


6.0 












3^f)PROTt[.NlNRN J A2 




32 


<NONE> 


<NONE> 


<NONE> 


137290


clover necrotic mosaic virus 
>ai|61466 (X0S0:O ORF tor 35 
kDa polypeptide I AA 1-317/ 
[Red clover necrotic mosaic 
virus] 


6.0 






Nearest Neighbor (BlastN vs. Genbank) 



SEQ ID | ACCESSION | DESCRIPTION



33 | <NONE> 



34 
35 



<NONE> 
<NONE> 



P VALUE 



<NONE> 



<NONE> 



<NONE> 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION | DESCRIPTION | P VALUE



<NONE> 



30041 



(X16711) pid:g30041 [Homo
sapiens]



<NONE> | 2493585 | CELL DIVISION PROTEIN FTSW



<NONE> 



1001450 | (D63999) hypothetical protein



5.9



5.7 



36 | <NONE> 



37 | <NONE>



38 | <NONE> 



39 | <NONE> 



40 | <NONE> 



<NONE> 



<NONE> | 3182918 | NITROGEN REGULATORY PROTEIN AREA



<NONE> 



<NONE> 



<NONE> 



<NONE> 



140011 



MITOCHONDRIAL
RIBOSOMAL PROTEIN S5
[Emericella nidulans
mitochondrion (SGC3)
>gi|12709 nidulans] >gi|472822
(J01390) unknown protein



<NONE> 



3979943 



UALUMm) predicted using 
Genefinder; similar to WD
domain, G-beta repeat; cDNA
EST yk362f7.5 comes from this 
gene; cDNA EST yk362f7.3 
comes from this gene 
[Caenorhabditis elegans] 



<NONE> | 950203 | (U31329) polyketide synthase
[Aspergillus terreus]



<NONE> 



<NONE> | 3560232 



(AL031530) hypothetical zinc
finger protein
[Schizosaccharomyces pombe]



5.2 



4.3 



4.0 



3.3 



3.0 



41 | <NONE>



42 | <NONE> 



<NONE> 



<NONE> | 730071



AXONEME-ASSOCIATED
PROTEIN MST101(1) product
[Drosophila hydei]



<NONE> 



<NONE> | 2506641



HYPOTHETICAL 11.7 KD

PROTEIN IN INTE-PIN 
INTERGENIC REGION 
>gi|1787402 (AE000214) orf,
hypothetical protein 
[Escherichia coli] 



2.6 



2.5 



43 | <NONE> 



44 | <NONE> 



45 | <NONE>



<NONE> 



<NONE> | 3511232



(AF071556) anthranilate 
dioxygenase large subunit 



<NONE> 



<NONE> | 1150900 



(U43139) envelope glycoprotein 
gp120 [Human

immunodeficiency virus type 1 



<NONE> 



<NONE> | 3876099



(Z75536) similar to dynein 
heavy chain; cDNA EST 
EMBL:D27549 comes from this 
gene; cDNA EST 
EMBL:D34859 comes from this
gene [Caenorhabditis elegans]



2.4 



1.9 



1.4 






SEQ 
ID 



46 



47 



48 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



<NONE> 



<NONE>



<NONE> 



49 | <NONE>



DESCRIPTION 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



P VALUE 



<NONE> 



<NONE> 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



3881150 



Genefinder 



132200 



<NONE> 



2204286 



DESCRIPTION 



P VALUE 



(AL032647) predicted using 



COLANIC ACID CAPSULAR
BIOSYNTHESIS
ACTIVATION PROTEIN A
>gi|95605|pir||S17701 rcsA
protein 



(U61380) germination protein
[Bacillus megaterium]



<NONE> 



1723955 



HYPOTHETICAL 11.4 KD
PROTEIN IN FOX1-KEX1 
INTERGENIC REGION 
>gi|2132566|pir||S64222 
probable membrane protein 
YGL204c - yeast 
(Saccharomyces cerevisiae) 
>gi|1322838|gnl|PID|e243803
(Z72726) ORF YGL204c 
[Saccharomyces cerevisiae] 



1.4 



1.1 



1.0 



50 | <NONE> 



51 | <NONE>



52 | <NONE> 



53 | <NONE>



<NONE> 



<NONE> 



3201564 



(AJ006514) prolipoprotein 
diacylglyceryl transferase
[Vibrio cholerae]



<NONE> 



<NONE> 



2808721 



(AL021428) hypothetical 
protein Rv0064



<NONE> 



<NONE> 



602434 



(U17986) GABA/noradrenaline 
transporter [Homo sapiens] 



<NONE> 



<NONE> 



3347955 



(AF076184) cytosolic sorting 
protein PACS-1b [Rattus
norvegicus] 



0.31 



0.27 



0.13 



54 | <NONE>



55 | <NONE> 



56 | <NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



I2558S7 



coded for by C.
elegans cDNA yk92b4.5; coded
for by C. elegans cDNA
yk73a1.5; coded for by C.
elegans cDNA yk102e9.5;
coded for by C. elegans cDNA
yk71c8.5; coded for by C.
elegans cDNA yk66d11.5;
coded for by C. elegans cDNA 
yk66c3... 



<NONE> 



103076 



<NONE> 



107560 



Bkm-like sex-determining
region hypothetical protein
CS314 - fruit fly (Drosophila
melanogaster) 



0.074 



0.003 



Ras inhibitor (clone JC265) -
human sapiens]



o.oo: 








Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












57


<NONE>


<NONE>


<NONE>


103076


Bkm-like sex-determining
region hypothetical protein
CS314 - fruit fly (Drosophila
melanogaster)


2e-04


58 


<NONE> 


<NONE> 


<NONE>


2702370 


(AF038604) contains similarity 
to Drosophila ovarian tumor
locus protein (GB:X13693)
[Caenorhabditis elegans]


6e-05 


59 


<NONE> 


<NONE> 


<NONE> 


3859713 


(AL033501) phox domain 
protein [Candida albicans] 


3e-05 


60 


<NONE> 


<NONE> 


<NONE> 


2088839 


(AF003386) F59E12.5 gene 
product [Caenorhabditis 
elegans] 


2e-08 


61 


<NONE> 


<NONE> 


<NONE> 


121059 


GC-RICH SEQUENCE DNA- 
BINDING FACTOR GCF - 
human >gi|179412 (M29204) 
DNA-binding factor [Homo
sapiens] 


4e-09 


62 




<NONE>


<NONE> 


3875246 


domain, G-beta repeats (2
domains); cDNA EST
EMBL:T00482 comes from this
gene; cDNA EST
EMBL:T00923 comes from this
gene; cDNA EST yk449d4.3 
comes from this gene; cDNA 
EST yk449d4.5 comes from this 
gen... 


9e-24 


63 


<NONE> 


<NONE> 


<NONE> 


1465834 


(U64857) No definition line 
found [Caenorhabditis elegans] 


9e-28


64 


<NONE> 


<NONE> 


<NONE> 


3327136 


(AB014561) KIAA0661 protein 
[Homo sapiens] 


1e-29


65 


<NONE> 


<NONE> 


<NONE> 


3880433 


(Z66521) similar to 
mitochondrial RNA splicing

EMBL:C09217 comes from this 
gene [Caenorhabditis elegans] 


8e-31 


66 


D42133


Rat annexin V gene,
exon7 and exon8


5.0 


<NONE> 


<NONE> 


<NONE> 


67 


L35679


Homo sapiens
(subclone H8 2_d11
from P1 35 H5 C8)
DNA sequence.


5.0 


1086902


[WU/H) coded for by C.
elegans cDNA yk79g8.5; coded
for by C. elegans cDNA
cm10c8; coded for by C. elegans
cDNA yk79g8.3; similar to
leucine-rich repeats found in
many proteins [Caenorhabditis
elegans]


6.6 








Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






68


U90184


HIV-1 strain BX220
from USA, envelope
glycoprotein C2V3
region (env) gene,
partial cds


5.0 


1297070 


(Z71986) convicilin precursor 
[Vicia narbonensis] 


6.6 


69 


U61465 


Human myosin VIIa
(MYO7A) gene, 5'
exon 37 


5.0 


2313225 


(AE000535) L-lactate permease 
(lctP) [Helicobacter pylori
26695] 


5.0 


70 


AF013717 


Homo sapiens 
periplakin (PPL)
mRNA, partial cds 


5.0 


3719238


(AF064869) brain-enriched 
guanylate kinase -associated 
protein 2; BEGA2 [Rattus 
norvegicus]


3.8 


71 


X58245 


Soybean mRNA for 
HMG-1 like protein 


5.0


2995363 


(AL022245) biotin synthase


0.99 


72 


AF102425


Frasera paniculata 
tRNA-Leu (trnL)
gene, intron,
chloroplast sequence


4.9 


3522958 


(AC004411) putative 
pectinesterase [Arabidopsis
thaliana]


6.4 


73 


X82817 


H.sapiens 

PTP1C/HCP-variant
gene 


4.9 


3875514 


EMBL:D27474 comes from this 
gene; cDNA EST 
EMBL:D27473 comes from this 
gene; cDNA EST 
EMBL:T00471 comes from this 
gene; cDNA EST 
EMBL:D34192 comes from this 
gene; cDNA EST 
EMBL:D37241 comes from this 
gene; ... 


2.8 


74 


U04827 


Mus musculus brain
fatty acid-binding 
protein 


4.9 



3676132 


(AL031765) 1- 

evidence=predicted by content; 
1-method=genefinder;084; 1-
method_score=31.96; 1- 
evidence_end; 2- 
evidence=predicted by match; 2- 
match_accession=SPTREMBL: 
Q93319;2- 

match_description=HYPOTHE
TICAL PROTEIN C33A11.2.;...


2e-09 


75 


AF038859


Neospora hughesi
strain NE1 internal
transcribed spacer 1,
complete sequence


4.8 


<NONE> 


<NONE> 


<NONE> 








Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


76


Y08222 


M.musculus MFH-1
gene 


4.8


<NONE> 


<NONE>


<NONE>


77 


AJ224475 


Borrelia burgdorferi 
left chromosomal 
subtelomeric region 
(pfpB gene) 


4.8 


4218141 


(AJ236702) HMR1 protein 
[Antirrhinum majus]


8.3


78 


U02486 


Mus musculus LAF 
putative membrane 
protein (KRAG) 
gene, exon 3 and 
complete cds 


4.8 


3258103 


(AP000006) 367aa long 
hypothetical protein 
[Pyrococcus horikoshii]


2.7


79 


AB000280


Rat mRNA for 
peptide/histidine 
transporter, complete 
cds 


4.8 


806317 


(M29067) unknown protein 
[Saccharomyces cerevisiae]


0.001


80 


Z49771 


A.cepa mitochondrial 
gene for NADH 
dehydrogenase 
subunit 3 and 
ribosomal protein 
S12 


4.5 


<NONE> 


<NONE> 


<NONE>


81


M63494 


Mouse IgG receptor 
(beta-Fc-gamma-RII) 
gene, exons 6 and 7, 
clones lambda- 
Fc(3.2,93). 


4.3 


<NONE> 


<NONE> 


<NONE>


82 


Z14035


S.pombe car1 gene


2.0 


3790665 


(AF099000) No definition line
found [Caenorhabditis elegans] 


1.2 


83 


U17129


Rhodococcus 
erythropolis ThcA 
(thcA) gene, complete 
cds; and unknown
genes


2.0 


2828280


(AL021687) putative protein
[Arabidopsis thaliana]
>gi|2832633|gnl|PID|e1249651
(AL021711) putative protein
[Arabidopsis thaliana]


2e-26 


84 


AE001386


Plasmodium
falciparum
chromosome 2,
section 23 of 73 of
the complete
sequence


2.0 


4176500


(AL031177) dJ889M15.3 (novel
protein)


9e-59 


85 


U79292


Human clone 23734
mRNA sequence


1.9 


<NONE> 


<NONE> 


<NONE> 


86 


V00159


Chloroplast Euglena
gracilis gene coding
for the 5S and 16S
rRNA.


1.9


<NONE> 


<NONE>


<NONE>






Nearest Neighbor (BlastN vs. Genbank)



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



SEQ
ID | ACCESSION | DESCRIPTION


87


U95094


Xenopus laevis XL-
INCENP (XL-
INCENP) mRNA,
complete cds



P VALUE | ACCESSION



1.9 



<NONE> 



DESCRIPTION 



<NONE> 



P VALUE



88 



X93206 



H salinarium TATA 
box-binding protein 
genes and ORFs 



1.9 



<NONE> 



<NONE> 



<NONE> 



89 | U60979



90 | X56272 



91 | L22383



92 | U82814 



93 | U18504



Caenorhabditis 
elegans programmed
cell death specifier
(ces-2) gene, 
complete cds 



1.9 



C. tentans ORFs (A- 
E) for hemoglobin 



<NONE> 



<NONE> 



<NONE> 



1.9 



Homo sapiens DNA 
sequence, repeat 
region.



<NONE> 



<NONE> 



<NONE> 



1.9 



Hirudo medicinalis 
neuron-specific 
protein mRNA, 
complete cds 



<NONE> 



<NONE> 



<NONE> 



Haplomitrium
hookeri 18S rRNA
gene, partial
sequence. 



1.9 



3822533 | (AF094531) immunoglobulin
heavy chain precursor



2.0 



1.9 | 1083969 | hypothetical protein 6 - fowlpox
virus virus]



94 | X53676



95 



96 



97 



Pseudomonas stutzeri 
nosDFY genes 
involved in copper 
processing 



U60086 



U33447 



Dictyostelium 
discoideum multidrug 
resistance 
transporter/Ser 
protease (tagC) 
mRNA, complete cds. 



Human putative G- 
protein-coupled 
receptor (GPR17)
gene, complete cds 



1.9 



2980781 | (AL022198) putative protein



0.70 



1.9 



3879530 



(Z49130) cDNA EST 
yk486b9.3 comes from this 
gene; cDNA EST yk486b9.5
comes from this gene 



6e-05 



M81327 



Sus scrofa lactoferrin 
mRNA, complete cds. 
> :: gb|I28421|I28421
Sequence 5 from 
patent US 5571691 



1.9 



3880034 



(Z75550) similar to cell division
control protein [Caenorhabditis
elegans]



7e-14 



1.8 



<NONE> 



<NONE> 



<NONE> 










Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






98


Y07622


S.iniae lctP & lctO
genes and ORF1


1.8 


<NONE> 


<NONE> 


<NONE> 


99 


M60474 


Mouse myristoylated 
alanine-rich C-kinase 
substrate (MARCKS) 
mRNA, complete cds 


1.8 


<NONE> 


<NONE> 


<NONE> 


100 


Y13901


Homo sapiens FGFR- 
4 gene 


1.8 


<NONE> 


<NONE> 


<NONE> 


101 


U44400 


Human Down 
Syndrome region of 
chromosome 21, 
clone A31D6-1D6. 


1.8


<NONE> 


<NONE> 


<NONE> 


102 


U92808 


Ruminococcus albus 
beta-glucosidase 
(gluA) mRNA, 
complete cds 


1.8 


<NONE> 


<NONE> 


<NONE> 


103 


L25051 


Candida albicans 
argininosuccinate 
lyase (ARG4) gene, 
complete cds. 


1.8 


<NONE> 


<NONE> 


<NONE> 


104 


AE000546 


Helicobacter pylori 
26695 section 24 of 
134 of the complete 
genome 


1.8 


<NONE> 


<NONE> 


<NONE> 


105 


J00978 


Xenopus laevis major 
beta-globin gene, 
complete cds. 


1.8 


<NONE> 


<NONE> 


<NONE> 


106 


U41716


Human
immunodeficiency
virus type 1 isolate
1W95-5 vpr gene,
complete cds.


1.8 


<NONE> 


<NONE> 


<NONE> 


107 


X66286 


G.gallus mRNA for 
tensin


1.8 


<NONE> 


<NONE> 


<NONE> 


108 


U76636


Xenopus calbindin
D28k mRNA,
complete cds


1.8 


<NONE> 


<NONE> 


<NONE> 


109 


J00664


Rabbit embryonic beta
l-globin gene. 


1.8 


<NONE> 


<NONE> 


<NONE> 


110 


M21535


Human erg protein
(ets-related gene)
mRNA, complete cds.


1.8 


2983160


(AE000693) hypothetical
protein [Aquifex aeolicus]


7.7 







Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


111


M80829


Rat troponin T
cardiac isoform gene,
complete cds


1.8


999450


(Z46595) incomplete interleukin
11 receptor isoform [Homo
sapiens]


7.3


112


D37887 


Cyprinus carpio c- 
myc gene for c-Myc, 
complete cds 


1.8


3023408


BRANCHED-CHAIN AMINO 
ACID TRANSPORT SYSTEM 
CARRIER PROTEIN 
(BRANCHED-CHAIN AMINO 
ACID UPTAKE CARRIER) 
>gi|1075007|pir||D64056 
membrane-associated 
component, branched amino 
acid transport system (brnQ) 
homolog - Haemophilus 
influenzae (strain Rd KW20) 

system II carrier protein (brnQ)
[Haemophilus influenzae Rd] 


7.2


113


AF019765


Homo sapiens G
protein-coupled
receptor kinase 1 and
G protein-coupled
receptor kinase 1b
(GRK1) gene,
alternatively spliced,
alternative exon 6,
exon 7, and partial
cds


1.8


498643


(U10270) G-box binding factor 
1 [Zea mays]


7.2 


114


AF025967


Helicobacter pylori
J166 virulence
regulon
transcriptional
activator homolog
gene, partial cds,
strain-specific
genomic sequence B2


1.8


3850108


(AL033388) putative calcium-
transporting atpase
[Schizosaccharomyces pombe]


5.7 


115


U13183


Xenopus laevis
(Xwnt-4) mRNA,
complete cds.


1.8


2494853


PROBABLE
HYDROXYACYLGLUTATHIONE
HYDROLASE
(GLYOXALASE II) (GLX II)
protein [Escherichia coli]
>gi|1786406 (AE000130)
probable
hydroxyacylglutathione
hydrolase [Escherichia coli]


5.5 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















116 


S68944 


Na+/Cl(-)-dependent 

neurotransmitter 

transporter 


1.8 


2276316 


(Z96810) GLYT-1 LIKE [Homo
sapiens] 


5.5 


117 


M92905 


Rat calcium channel 
alpha-1 subunit (rbB-
I) mRNA, complete
cds. 


1.8 


3165522 


(AF067607) Similar to cuticular 
collagen; C18H7.3 


5.5 


118


X12429


Xenopus laevis U1
70K gene exon 10 


1.8


2735957 


(AF015685) reverse 
transcriptase domain protein 


3.3 


119 


D83333 


Mouse hepatitis virus 
genomic RNA for 
spike protein, partial 
cds 


1.8 


3876559 


rr^Tiu/i; jnwiijjiiy iu nuinirn — 
cyclin A/CDK2-associated
protein P19 (RNA polymerase
elongation factor)
(SW:SKP1_HUMAN); cDNA
EST EMBL:T00114 comes
from this gene; cDNA EST
yk390f11.5 comes from this
gene; cDNA EST yk402e11.5

co...

>gi|3877216|gnl|PID|e1346850
protein P19 (RNA polymerase
elongation factor) gene; cDNA
EST yk390f11.5 comes from
this gene; cDNA EST
yk402e11.5 co...


3.3 


120 


AF016972


Cervus elaphus 
REDDEER 
mitochondrial D- 
loop, complete 
sequence 


1.8 


3878057 


(Z99942) similar to von 
Willebrand factor type A 
domain; cDNA EST yk412d4.5 
comes from this gene; cDNA 
EST yk412d4.3 comes from this 
gene 


3.2 


121 


AB010741


Oncorhynchus mykiss 
mRNA for rtSox24,
complete cds 


1.8 


1730805 


RYPUlHhllCALil.UKJJ 
PROTEIN IN RPS3-PSD1 
INTERGENIC REGION 
>gi|2132762|pir||S63129
probable membrane protein
YNL174w - yeast
(Saccharomyces cerevisiae)
>gi|1302152|gnl|PID|e239548
(Z71451) ORF YNL174w
[Saccharomyces cerevisiae]


2.5 


122 


U32844 


Haemophilus 
influenzae Rd section 
159 of 163 of the
complete genome 


1.8 


728910


A-TYPE INCLUSION 
PROTEIN (ATI) camelpox
virus >gi|62381 (X69774)
84kDa A-type inclusion protein
[unidentified]


1.9 




WO 01/02568 
Nearest Neighbor (BlastN vs. Genbank)



SEQ 

ID | ACCESSION



123 | U18321



DESCRIPTION 



Human ionizing 
radiation resistance 
conferring protein 
mRNA. complete cds. 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



P VALUE | ACCESSION



1.8 



2133273 



124 | M28668 



125 | AF064553



Human cystic fibrosis
mRNA, encoding a
presumed
transmembrane
conductance regulator
(CFTR). > ::
gb|I11500|I11500
Sequence 1 from 
Patent US 5407796 



DESCRIPTION 



ribosomal protein YS7 homolog 
Emericella nidulans 



1.8 



90492 | filaggrin precursor - mouse
(fragment)


PROBABLE PROTEIN



Mus musculus NSD1 
protein mRNA, 
complete cds 



1.8 



2501207 



DISULFIDE ISOMERASE P5 
PRECURSOR >gi|1065461 
(U40411) Similar to protein 
disulfide-isomerase.
[Caenorhabditis elegans]



P VALUE



126 | AB002314



127 | L42096



128 | M37278



Human mRNA for 
KIAA0316 gene,
complete cds 



Homo sapiens 
(subclone 10_d2 from
P1 H21) DNA
sequence. 



115131 



REGULATORY PROTEIN
BRLA (BRISTLE A PROTEIN)
>gi|83718|pir||A28913
regulatory protein brlA -
Emericella nidulans >gi|168029
(M20631) brlA protein



R.norvegicus renin 
gene, exons 1-9. 



1.8 | 2135624 | metalloproteinase I (EC 3.4.24
) - human



1.8 



4050087 



(AF109907) S164 [Homo
sapiens] 



1.4 



0.87 



0.87 



0.84 



0.65 



129 | X82879



131 



Artificial sequences 
DNA for ART 2 
consensus 



1.8 



310929 



(L13442) cysteine-rich extensin
like protein-4 [Nicotiana
tabacum]



130 | D89729



U78076



Homo sapiens mRNA 
for CRM1 protein,
complete cds

1.8



3559944 | (AJ010792) Muc5AC protein
[Mus musculus]



0.52 



0.38 



Mus musculus 
sepiapterin reductase 
gene, exons 1 and 2



1.8 



2984225 



(AE000766) enolase- 
phosphatase E-1 [Aquifex
aeolicus]



Nearest Neighbor (BlastN vs. Genbank)

SEQ

ID | ACCESSION



132 



X52133



133 I M77830 



134 | AJ224150



135 | AJ005518 



136 | AF002217



DESCRIPTION 



Paramecium 168G
gene for 168G
surface protein 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



P VALUE | ACCESSION 



1.8 



115316



Human desmoplakin I 
mRNA, complete cds



Plasmodium berghei 
EF-1 alpha A-gene



Mus musculus
somatostatin receptor

gene, exon1 and 5'
flanking region



137 | AF039035 



Ralstonia eutropha
megaplasmid pHG1
nitric oxide reductase 
(norB) gene, 
complete cds 
Caenorhabditis 



elegans cosmid 
C53A3 



DESCRIPTION 



COLLAGEN ALPHA 1(VIII)

CHAIN PRECURSOR 
(ENDOTHELIAL 
COLLAGEN) 

>gi|105686|pir||S15435 collagen
alpha 1(VIII) chain precursor



P VALUE 



1.8 



1397246 



1.8 



1353761 



1.8 



1326350 



1.8 



3393018 



1.8 



3850109



'51944) coded for by C.
elegans cDNA yk112f3.5; coded
for by C. elegans cDNA
cm21d2; coded for by C.
elegans cDNA CEESR07F;
coded for by C. elegans cDNA
yk112f3.3; coded for by C.
elegans cDNA CEESR29F
[Caenorhabditis elegans]



(U43192) myosin II heavy chain
[Naegleria fowleri] 



(U58748) similar to potential 
transmembrane domains in S. 
cerevisiae nuclear division
RFT1 protein (SP:P38206) 



(AL031174) hypothetical 
protein 



(AL033388) 3-oxoacyl-[acyl-
carrier-protein]-synthase 



0.073 



1e-04



2e-05 



2e-08 



138 | M81769 



139 | Y11106



S.domesticus 
immunoglobulin 
rearranged gamma 
chain mRNA, VJC 
region, complete cds. 



P.pastoris PYC1 gene



1.8 



3080527



(AL022600) putative mannose-1 
phosphate guanyl transferase
[Schizosaccharomyces pombe]



1.8 



1175412



HYPOTHETICAL 24.2 KD 
PROTEIN C13A11.03 IN
CHROMOSOME I >gi|984224 
(Z54096) unknown 



3e-14 



140 | U87803



Human putative 
Ca2+/calmodulin-
dependent protein
kinase kinase gene, 3'
flanking region, 
partial sequence 



1.8 



2828280



(AL021687) putative protein
[Arabidopsis thaliana]
>gi|2832633|gnl|PID|e1249651
(AL021711) putative protein
[Arabidopsis thaliana]



3e-17 






WO 01/02568 



PCT/US00/18374



Nearest Neighbor (BlastN vs. Genbank)



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



SEQ 

ID | ACCESSION



DESCRIPTION 



Plasmodium 



P VALUE | ACCESSION 



DESCRIPTION 



P VALUE 



falciparum 
chromosome 2, 
section 67 of 73 of 
the complete 
AE0014 30 sequence 



1.8 



I93I647 



(U95973) endomembrane 
protein EMP70 precusor isi



. J prec 

HYPOTHETICAL 75.5 KD
PROTEIN C14A4.3 IN
CHROMOSOME II
>gi|3874230|gnl|PID|e1351618
protein (Swiss Prot accession
number P38376); cDNA EST
yk220e10.5 comes from this
gene



olog 



2e-20 



142 | L19708



Rat N-methyl-D-
aspartate receptor
(NMDAR1) gene,
first exon. 



1.8 



1731181 



gene [Caenorhabditis elegans] 
(Z81103) predicted using 
Genefinder; cDNA EST 
yk303g11.5 comes from this
gene; cDNA EST yk303g11.3
comes from this gene 
[Caenorhabditis elegans] 



3e-21 



143 | Y1Q728 



P.schwarzi 
mitochondrial cytb 
gene, partial 



1.8 



3878644 



1e-28



144 | AB006631



Homo sapiens mRNA
for KIAA0293 gene,
partial cds



1.8 



4176500 



(AL031177) dJ889M15.3 (novel 
protein) 



7e-45 



iMus musculus 13 
(protein mRNA, 
JVFI06967 complete cds 



1.7



<NONE> 



<NONE>



<NONE> 



146 | AE001Q73 



Archaeoglobus
fulgidus section 34 of
172 of the complete
genome 



1.7 



<NONE> 



<NONE> 



<NONE> 



147 | U12977



Pseudomonas
lemoignei poly(3-
hydroxybutyrate)
depolymerase A
precursor (phaZ5)
gene, complete cds,
and glycerol-3-
phosphate-
dehydrogenase
homolog, complete
cds.



1.7 



<NONE> 



<NONE> 



<NONE> 



148 | M27038



Mus musculus

(SK/CamRk)

germline IgK chain
gene, J1-5 region.



1.7 



<NONE> 



<NONE> 



<NONE> 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






H.sapiens HRF-1










149 


X74142 


mRNA for 
transcription factor 


1.7 


<NONE> 


<NONE> 


<NONE> 


150 


U40830 


Streptococcus
thermophilus DeoD
gene, partial cds and
EpsA, EpsB, EpsC,
EpsD, EpsE, EpsF,
EpsG, EpsH, EpsI,
EpsJ, EpsK, EpsL,
EpsM, Orf14.9
protein genes,
complete cds


1.7


<NONE> 


<NONE> 


<NONE> 


151 


L29172 


Rabbit Ig germline 
gamma H-chain 
(allotype d12,e15) C-
region gene, 3' end.


1.7 


<NONE> 


<NONE> 


<NONE> 


152 


M19045 


Human lysozyme 
mRNA, complete cds. 


1.7 


<NONE> 


<NONE> 


<NONE> 


153 


AE001159


Borrelia burgdorferi 
(section 45 of 70) of 


1.7






<NONE>


154 


L17027


Plasmid pFdA (from 
Fremyella 
diplosiphon) DNA 
sequence, including 
unidentified cds and 

stem loop.


1.7 








155 


U12232 


Arabidopsis thaliana 
Columbia GTP 
binding protein beta 
subunit (AGB1)
mRNA. complete cds. 


1.7 


<NONE> 


<NONE> 


<NONE> 


156 


D42056 


Arabidopsis thaliana 
ATPK6 mRNA for 
ribosomal-protein S6 
kinase homolog. 
complete cds


1.7 


<NONE> 


<NONE> 


<NONE> 


157 


X98117 


Rhizobium

leguminosarum prsD,
prsE, ORF3 genes


1.7 


<NONE> 


<NONE> 


<NONE> 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















158 


AF039084 


Spinacia oleracea 
heat shock 70 protein 
protein, complete cds 


1.7 


<NONE> 


<NONE> 




159 


Z12651


R.norvegicus gene for 
catechol 

methyl transferase 


1.7 


<NONE> 


<NONE> 


<NONE> 


160 


AF002968 


Fringilla coelebs 
mitochondrial control 
region, partial
sequence 


1.7 


<NONE> 


<NONE> 


<NONE> 


161 


AE001160 


Borrelia burgdorferi 
(section 46 of 70) of 
the complete genome 


1.7 


<NONE> 


<NONE> 


<NONE> 


162 


U67553 


Methanococcus 
jannaschii section 95 
of 150 of the 
complete genome


1.7 


<NONE> 


<NONE> 


<NONE> 


163 


M86247 


S.ruminantium 
plasmid pS23 DNA. 


1.7 


<NONE> 


<NONE> 


<NONE> 


164 


S74436 


oIL-8=interleukin-8
[sheep, spleen cells, 
mRNA. 1435 nt] 


1.7 


<NONE> 


<NONE> 


<NONE> 


165 


D12719 


Candida maltosa 
ALK7 (CYP52A10) 
and ALK8 complete 
cds 


1.7 


<NONE> 


<NONE> 


<NONE>


166 


U02625 


Geotrichum 
candidum NRRL Y- 
553 lipase gene, 
partial cds. 


1.7 


321245 


230k bullous pemphigoid 
antigen BPM1 - mouse


9.3 


167 


Z588S1 


H.sapiens CpG DNA, 
clone 1 14a4, reverse 
read cpsl 14a4.rtla . 


1.7 


1854675 


(U66298) bone morphogenetic 
protein-6 [Rattus norvegicus]


9.1 


168 


U43674 


Agrobacterium 
tumefaciens conjugal 
transfer region 1 
genes


1.7 


1352066 


LARGE PROLINE-RICH
PROTEIN BAT2 MHC class III 
histocompatibility antigen HLA- 
B -associated transcript 2 - 
human >gi| 179339 (M33509) 
HLA-B-associated transcript 2 
(BAT2) [Homo sapiens] 
>gi| 179345 (M33518) HLA-B- 
associated transcript 2 (BAT2) 
[Homo sapiens]


9.1 
SEQ
ID 



Nearest Neighbor (BlastN vs. Genbank) 
ACCESSION | DESCRIPTION | P VALUE



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



DESCRIPTION 



P VALUE 



PROTOPORPHYRINOGEN



169 



AL023827 



Caenorhabditis 
elegans cosmid 
Y12A6A, complete 
sequence 
[Caenorhabditis 
elegans]



1.7 



731440 



OXIDASE (PPO) yeast
(Saccharomyces cerevisiae) 
>gi|603606 (U18778) Heml4p 
protoporphyrinogen oxidase 
[Saccharomyces cerevisiae] 
>gi|1403536|gnl|PID|e249333
(Z71381) protoporphyrinogen 
oxidase [Saccharomyces 
cerevisiae] 



170 



X69662 



X.laevis mRNA for
glutathione 
synthetase, large 
subunit 



1.7 



4038057 



(AC005897) hypothetical 
protein [Arabidopsis thaliana] 



171 



Z35824 



S.cerevisiae 
chromosome II 
reading frame ORF 
YBL063w 



1.7 



3021450 



(Y15515) prdl-a [Hydra 
vulgaris] 



172 



M65139 



Cowpea chlorotic 
mottle virus (CCMV) 
1a protein gene,
complete cds. 



1.7 



2506307 



COLLAGEN ALPHA 1(XII)
CHAIN PRECURSOR I (XII) 
chain - chicken 
>gi|222 8 1 1 |gnI|PID]d i 00 11 60 
gallus] 

>gi|2326442|gnl|PID|e39435 
(X61024) collagen type XII 
alpha I chain [Gallus gallus] 



PROTEIN IN ALPA-GABD 
INTERGENIC REGION (F87) 
>gi|1033124 (U36840)
ORFJS7 [Escherichia coli] 
>gi|178S982 (AE00034S) orf, 
(hypothetical protein 



173 



X15065 



Drosophila distal BX 
C region (bithorax 
complex) pH189 5'



region; 



1.7



1723625 


Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


174 


Z46255 


S.cerevisiae 
chromosome VI 
lambda clone. 


1.7 


3875228 


(Z46792) similar to lethal(1)
discs large-1 tumor suppressor
protein-like repeats; cDNA EST
EMBL:D33495 comes from this
gene; cDNA EST
EMBL:D35117 comes from this
gene; cDNA EST
EMBL:D36356 comes from this
gene; cDNA EST EMB...
>gi|3879984|gnl|PID|e1351767
suppressor protein-like repeats;
cDNA EST EMBL:D33495
comes from this gene; cDNA
EST EMBL:D35117 comes
from this gene; cDNA EST
EMBL:D36356 comes from this
gene; cDNA EST EMB...


6.7


175 


U01066 


Human CD4 
promoter, partial 
sequence. 


1.7 


125448 


THYMIDINE KINASE 
saimiriine herpesvirus 1 (strain 
11[Onc]) >gi|60341


6.7


176 


U34743 


Phalaenopsis sp. 
'hybrid SM9108'
homeobox protein 
mRNA. complete cds 


1.7 


1022918 


(U38184) ATPase subunit 6 
[Trypanosoma cruzi]


6.7


177 


U14662


Baboon herpesvirus 
HVP2 gB 

glycoprotein (UL27) 
gene, complete cds. 


1.7 


3218378 


(AL023862) hypothetical 
protein SC3F9.07 [Streptomyces
coelicolor]


6.7


178 


AB017006


PMS2L15 mRNA, 
partial cds 


1.7 


1465855


(U64859) glutamine-rich protein
[Caenorhabditis elegans]


6.7


179 


U92651


Brassica oleracea var.
botrytis tonoplast
intrinsic protein
bobTIP26-1 mRNA,
complete cds


1.7 


3023675


DYNEIN HEAVY CHAIN,
CYTOSOLIC (DYHC) dynein
heavy chain

[Schizosaccharomyces pombe]


6.6 


180 


AF000634


Lytechinus variegatus
Notch homolog
mRNA, complete cds


1.7 


148574


(M58520) endo-1,4-beta-
glucanase [Fibrobacter
succinogenes]


6.6
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION 


DESCRIPTION 


P VALUE 


181 


M92354 


Arabidopsis thaliana 
anthranilate synthase 
alpha subunit gene, 
complete cds. 


1.7


738308 


blue light photoreceptor 
[Arabidopsis thaliana]


6.5 


182 


AJ234856 


Hordeum vulgare 
genomic DNA 
fragment; clone 
MWG2234.rev 


1.7 


3142302 


(AC002411) Strong similarity to
myosin heavy chain gb|Z34293
from A. thaliana. [Arabidopsis
thaliana]


6.5


183 


U76827 


kJLd KtVJI ill J Uj 

parasiticus bird J33 
cytochrome b protein, 
partial cds 


1.7 


3413810 


(Y17034) Bassoon [Mus
musculus] 


5.4 


184 


U05211 


Saccharomyces 
cerevisiae Ttp1p
(TTP1) gene,
complete cds. 


1.7 


403173 


(L24492) lipoprotein 
[Rhodococcus erythropolis] 


4.9 


185 


AF076974 


Homo sapiens 
TRRAP protein 

(TRRAP) mRNA,

complete cds


1.7


1170140 


PUTATIVE 

ENDOGLUCANASE TYPE K 
PRECURSOR (ENDO-1,4- 
BETA-GLUCANASE) 
(CELLULASE) 


4.1 


186 


AE000753 


Aquifex aeolicus 
section 85 of 109 of 
the complete genome 


1.7


1169357


DNA ADENINE METHYLASE 
site-specific DNA- 
methyltransferase (adenine- 
specific) dam methylase gene 
product [Vibrio cholerae] 


4.0 


187 


AF005638 


Tupaia glis 
apolipoprotein AI 
prepropeptide
mRNA, complete cds


1.7 


3355682


(AL031124) putative secreted
lyase


4.0 


188 


M23090


Human germ line IgK
chain gene V3-region,
clone Humkv328h5


1.7


2257483


(AB004534) pi003
[Schizosaccharomyces pombe]


4.0 


189 


M24001


Mink enteritis virus
antigenic type 2
capsid protein genes
VP1 and VP2,
complete cds.


1.7


2143504


myotonic dystrophy kinase -
mouse (fragment) kinase, DM-
kinase {C-terminal, alternatively
spliced, clone delta IUIIJV.V)
[mice, brain, Peptide Partial,
74 aa] [Mus sp.]


3.9 


190 


X59964


H.sapiens CST4 gene
for Cystatin D


1.7


1766075


(U37273) winged helix protein
CWH-2 [Gallus gallus]


3.1 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












HYPOTHETICAL 11.7 KD 




191 


X95276 


P.falciparum
complete gene map of
plastid-like DNA (IR-
B)


1.7


3219951 


PROTEIN C6B12.13 IN
CHROMOSOME I 
>gi|2330843|gnl|PID|e334047 
pombe] 


3.0 


192 


D84487 


Rat PMSG-induced 
ovarian mRNA, 
3'sequence, N10 


1.7 


173164 


(J02719) valyl-tRNA synthetase 
[Saccharomyces cerevisiae] 


2.3 


193 


L14851


Rattus norvegicus
neurexin III-alpha
gene, complete cds. 


1.7 


3323586 


(AF060869) single-strand 
binding protein [Salmonella 
typhimurium] 


2.3 


194 


M97002 


Xenopus laevis/gilli 
hybrid pseudo-IgH 
chain gene, V region, 
clone LG7G342A. 


1.7 


2118407


MHC sex-limited protein - 
mouse (fragment) musculus] 


2.3 


195 


L07025 


Bacillus thuringiensis
delta-endotoxin 
(CryA(a)) gene, 5' 
end. > :: 

gb|I34520|J34520 
Sequence 1 from 
patent US 5596071 > 
:: gb|I39790|I39790 
Sequence 1 from 
patent US 5616495 > 

gb|AR008487|AR008
487 Sequence 1 from 
patent US 5753492 


1.7 


2496940 


HYPOTHETICAL 534 KD 
PROTEIN D1054.13 IN
CHROMOSOME V
>gi|3875316|gnl|PID|e1344967


1.8 


196 


S73149 


insulin-like growth 
factor II {intron 7}
[human, Genomic,
1702 nt] 


1.7 


3327038 


(AB014512) KIAA0612 protein 
[Homo sapiens]


1.8 


197 


D86990


Human (lambda)
DNA for

immunoglobulin light
chain


1.7 


494367


Fv Fragment (Murine Se155-4)
Complex With The
Trisaccharide: Alpha-D-
Galactose(1-2)[alpha-D-
Abequose(1-3)]alpha-D-
Mannose (P1-Ome) (Part Of
The Cell-Surface Carbohydrate
Of Pathogenic Salmonella)


1.8 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 




Plasmid pFdA (from 










198 


L17027 


Fremyella 
diplosiphon) DNA 
sequence, including 
unidentified cds and 
stem loop. 


1.7 


1082702 


poliovirus receptor-related 
protein - human 


1.4 


199 


AL022273 


Caenorhabditis
elegans cosmid 
H22D14, complete 
sequence 
[Caenorhabditis 
elegans] 


1.7 


3924605 


(AF069442) putative inhibitor 
of apoptosis [Arabidopsis 
thaliana]


1.4 


200 


U89926 


Drosophila 
melanogaster cut 
gene, partial sequence 


1.7 


2245100 


(Z97343) DNA-binding protein 
homolog 


1.3 


201 


Z25749 


H. sapiens gene for 
ribosomal protein S7 


1.7 


2493459 


PROTEIN KINASE C 
SUBSTRATE. 60.1 KD 
PROTEIN, HEAVY CHAIN 
(PKCSH) (80K-H PROTEIN) 
>gi|1215746


1.1 


202 


U59841 


Fundulus heteroclitus 
lactate dehydrogenase 
B 


1.7 


3005587 


(AF048977) Ser/Arg-related 
nuclear matrix protein [Homo 
sapiens] 


0.82


203 


X55763


Rabbit mRNA for 
smooth muscle 
calcium channel 
blocker (CaCB) 
receptor 


1.7 


3883128 


(AF082302) arabinogalactan-
protein [Arabidopsis thaliana]


0.82 


204 


Z75528 


Caenorhabditis 
elegans cosmid 
C18B12A, complete 
sequence 
[Caenorhabditis 
elegans]


1.7 


940397 


(D10123) core [Hepatitis C
virus]


0.80


205 


U50912 


Human XIST gene, 
polypurine-
pyrimidine repeat
region 


1.7 


2338027 


(AF005370) large tegument 
protein [Alcelaphine herpesvirus 
1]


0.59 


206 


X12817 


Ovis aries beta-
lactoglobulin gene


1.7 


987050 


(X65335) lacZ gene product 
[unidentified cloning vector] 


0.45 


207 


AF004419


Homo sapiens
troponin T (TNNT2)
gene, exon 13


1.7 


2996364 


(AF053947) unknown [Yersinia 
pestis] >gi|3883090


0.22 


208 


L43643 


Gallus domesticus 
DNA microsatellite 
marker MCW119 


1.7 


464S96 


TRANSDUCIN-LIKE
ENHANCER PROTEIN 1 
enhancer-of-split homolog TLE- 
1 - human >gi|307510 


0.20



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















209 


Z73278


S.cerevisiae
chromosome XII
reading frame ORF
YLR106c


1.7 


1351657 


HYPOTHETICAL 121.9 KD
PROTEIN C30D11.04C IN 
CHROMOSOME I 
>gi|2130411|pir||S62562 
hypothetical protein 
SPAC30D1 1.4c - Fission yeast 
nuclear pore complex protein 
[Schizosaccharomyces pombe] 


0.20 


210 


M22345 


Mouse endogenous 
provirus gag, pol, and 
env region DNA. 


1.7 


2444455 


(AF020765) hypothetical 
protein [Myxococcus xanthus] 


0.12 


211 


AE000360 


Escherichia coli K-12 
MG1655 section 250 
of 400 of the 
complete genome 


1.7 


2736361 


(AF039038) No definition line 
found [Caenorhabditis elegans] 


0.12 


212 


AB020692 


Homo sapiens mRNA 
for KIAA0885 
protein, complete cds 


1.7 


2605924 


(AF029726) histidine kinase C 
[Dictyostelium discoideum]


0.094 


213 


S69429 


testis-determining 
gene/SRY homolog 
[Sminthopsis 
macroura=striped-
faced dunnarts,
Genomic, 855 nt] 


1.7 


2499016 


TONB PROTEIN >gi|1666536 
(U23764) TonB [Pseudomonas 
aeruginosa] 


0.092 


214 


S69429 


testis-determining 
gene/SRY homolog 
[Sminthopsis 
macroura=striped- 
faced dunnarts, 
Genomic, 855 nt]


1.7


2499016


TONB PROTEIN >gi|1666536
(U23764) TonB [Pseudomonas 
aeruginosa] 


0.088 


215 


U67205 


Mus musculus ACF7 
neural isoform 3 
(mACF7) mRNA, 
partial cds 


1.7 


2047349 


(AF000198) weak similarity to 
HSP90 [Caenorhabditis elegans] 


0.052 


216 


X98188


Artificial DNA 
sequence for 
mammalian lambda- 
neo minichromosome, 
1400 bp 


1.7 


2493779 


PUTATIVE CUTICLE 
COLLAGEN C09G5.6 
collagen; cDNA EST yk244c3.5 
comes from this gene; cDNA 
EST yk244c3.3 comes from this 
gene [Caenorhabditis elegans] 


0.042 


217 


U70139 


Mus musculus 
putative CCR4 
protein mRNA. 
partial cds 


1.7 


2252630 


(U95973) hypothetical protein 
[Arabidopsis thaliana] 


0.041 






SEQ 
ID 


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION | DESCRIPTION | P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION | DESCRIPTION | P VALUE


218 | L38808


Homo sapiens alpha-
1 type V collagen
(COL5A1) gene, 5'-
flank and exon 1.


1.7 


2895760 


(AF045246) universal minicircle

sequence binding protein

minicircle sequence binding
protein [Crithidia fasciculata]


0.039


219 | Z72151


B.napus mRNA for
AMP-binding protein


1.7 


190475 


fK.02 j/fit *»nlivnrv nrnlinp i"i/»H 

^iVLj / u; otiiivuj/ jjiuiinc-ncn 
protein 1 [Homo sapiens] 


0.011 


220 | X94152


R.norvegicus mRNA
for cysteine sulfinate
decarboxylase


1.7


2136212


synapsin IIb - human

>gi|1594277 (U40215) synapsin

IIb [Homo sapiens]


0.008 


221 L20255 


Mouse stathmin gene 
sequence. 


1.7


2317934


(U97553) unknown [murine

herpesvirus 68] 


0.006 


222 | L13600


Rattus norvegicus 
glycine transporter 
mRNA, complete cds. 


1.7 


726403 


(U23175) similar to anion
exchange protein

[Caenorhabditis elegans]


0.003 


223 | AJ224150


Plasmodium berghei 
EF-1 alpha A-gene 


1.7 


2072290 


(U95094) XL-INCENP 


0.001 


224 | S80642


butyrophilin [mice, 
lactating mammary 
gland, mRNA Partial, 
3193 nt] 


1.7 


2695746 


(AJ223010) Pmt2

[Schizosaccharomyces pombe]


9e-04 


225 | M22363


C.elegans unc-86 
gene encoding two 
alternative proteins, 
complete cds. 


1.7 


2224683 


(AB002369) KIAA0371 [Homo 
sapiens]


1e-04


226 X92123 


M.musculus cgt gene 
exon 1


1.7 


3874232 


(Z49909) similar to Prokaryotic 
ribonuclease PH
[Caenorhabditis elegans]


3e-05 


227 | AB016000


Ipomoea nil PKn2
(knotted-like gene)
mRNA, complete cds


1.7


(AF000422) TTF-I interacting
peptide 5 [Homo sapiens]


1e-05


228 | D14133


Bovine mRNA for
synaptocanalin I


1.7 


2183083


3925277


ALu3264ij similar to 
Uncharacterized protein family
UPF0034, Double-stranded
RNA binding motif; cDNA EST
yk489b3.5 comes from this
gene; cDNA EST yk439g7.5
comes from this gene
[Caenorhabditis elegans]


2e-06 








Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















229 


L01991 


Mus musculus TAFG 
l-like neuronal 
glycoprotein (PCS) 
mRNA, complete cds 


1.7 


3006139 


(AL022299) hypothetical
protein 


4e-07 


230 


X63016 


Tomato yellow leaf 
curl virus Thailand 
isolate complete 
genome (TYLCV-TH 
B-DNA) 


1.7


3643608 


(AC005395) hypothetical 
protein [Arabidopsis thaliana] 


1e-07


231 


Z22802


H.sapiens 

microsatellite repeat. 

> :: 

gb|G34562|G34562 
human STS SHGC- 
51834 


1.7


100210 


extensin precursor (clone Tom L 
4) - tomato esculentum] 


4e-09 


232 


K02765 


Human complement 
component C3 
mRNA, alpha and 
beta subunits, 
complete cds. 


1.7 


2984320 


(AE000773) acetoin utilization 
protein [Aquifex aeolicus] 


1e-09


233


Z74818 


S.cerevisiae 
chromosome XV 
reading frame ORF 
YOL076w 


1.7 


3873700 


(Z731U2) predicted using 
Genefinder; Similarity to 
Bacillus subtilis DNAJ protein 
gene; cDNA EST 
EMBL:C12520 comes from this
gene; cDNA EST 
EMBL:D71409 comes from this 
ge... 


7e-11


234 


D21871 


Pig mRNA for thimet 
oligopeptidase 


1.7 


2632098 


(Y15513) Prodos protein
[Drosophila melanogaster]


8e-13


235 


Y14344


Gallus gallus gene
encoding neurofascin,
exons 9,10,11 & 12


1.7


3876421


(Z81070) cDNA EST
EMBL:C12730 comes from this
gene; cDNA EST yk200b6.5
comes from this gene; cDNA
EST yk349a12.5 comes from
this gene [Caenorhabditis
elegans]


3e-14 


236 


Z73608


S.cerevisiae
chromosome XVI
reading frame ORF
YPL252c


1.7


1439663


(U64605) C05D9.6 gene
product [Caenorhabditis
elegans]


6e-18 








Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












OLIGOSACCHARYL




237 


AG000518 


Homo sapiens 
genomic DNA, 21q
region, clone: 
T171N23 


1.7 


1174468 


TRANSFERASE STT3 
SUBUNIT HOMOLOG 
>gi|529357 (U13019) No
definition line found 
[Caenorhabditis elegans] 


6e-18 


238 


D17716 


Human mRNA for N-
acetylglucosaminyltransferase V,
complete cds


1.7 


961446 


(D63877) KIAA0157 gene 
product is novel. 


5e-19


239 


AF102512 


Cheilodactylus
vittatus country USA: 
Midway Island 
cytochrome c oxidase 
subunit I gene, 
mitochondrial gene 
encoding 
mitochondrial 
protein, partial cds 


1.7 


1572756 


(U70848) C43G2.1 gene 
product [Caenorhabditis 
elegans] 


5e-40 


240 


L30107


Rattus norvegicus 
liver-specific 
transporter gene, 
promoter region. 


1.7 


4176443 


(AL022238)dJ1042K10.4 
(novel protein) 


3e-49 


241 


X91220 


H. sapiens mRNA for 
Na-Cl electroneutral 
thiazide-sensitive
cotransporter 


1.7 


3478637 


(AC005546) R29425_1 [Homo
sapiens] 


6e-54 


242 


U97146


Rattus norvegicus 
calcium-independent 
phospholipase A2 
mRNA, complete cds 


1.6 


<NONE> 


<NONE> 


<NONE> 


243 


Z48508 


Pea seed-borne
mosaic virus RNA for
coat protein and 
polymerase (partial) 


1.6 


<NONE> 


<NONE> 


<NONE> 


244 


M18349


Rat leukocyte
common antigen (L-
CA) gene, exons 1
through 5.


1.6 


<NONE> 


<NONE> 


<NONE> 


245 


M13158 


Yeast (S.pombe) 
cdc25+ gene (mitosis 
initiation), complete 
cds. 


1.6 


<NONE> 


<NONE> 


<NONE> 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Mycoplasma 










246 


U39712 


genitalium section 34
of 51 of the complete 
genome 


1.6 


<NONE> 


<NONE> 


<NONE> 


247 


M17922 


Mouse Murine 
urokinase-type 
plasminogen activator 
protein gene, 
complete cds. 


1.6 


3875750 


(ZJJ1499) predicted using 
Genefinder; cDNA EST 
yk410e3.3 comes from this 
gene;cDNA EST yk410e3.5 
comes from this gene 
[Caenorhabditis elegans] 


8.0 


248 


M89986 


Human polymorphic 
loci in Xq28. 


1.6 


3261710 


(Z84724) psd [Mycobacterium 
tuberculosis] 


6.4 


249 


M89986 


Human polymorphic 
loci in Xq28. 


1.6 


2143805 


inositol-polyphosphate 4- 
phosphatase - rat 


6.2 


250 


U68725 


Rattus norvegicus 
Deleted in colorectal 
Cancer 


1.6 


1256804 


(U51449) RING3 protein 
[Xenopus laevis] 


5.8 


251 


X95199 


P.platessa GSTA. 
GSTA1, GSTA2, and 
PPTN genes 


1.6 


3915113 


MALEYLACETATE 
REDUCTASE Pseudomonas 
cepacia >gi|643636 (U19883) 
maleylacetate reductase
[Burkholderia cepacia] 


4.9 


252 


Y09103 


D.melanogaster 
RPA1 gene 


1.6 


3916021 


HYPOTHETICAL 5 1 Kb 
PROTEIN IN COB INTRON 
>gi|2654230|gnl|PID|e1192341
(X02819) unidentified reading 
frame [Schizosaccharomyces 
pombe] 


4.8 


253 


Z14078


T.aestivum 
mitochondrion fMet, 
18S, 5S repeat unit
DNA 


1.6 


2501668 


DYSTROPHIN- RELATED 
PROTEIN 2 sapiens] 


3.6 


254 


AB002314 


Human mRNA for 
KIAA0316 gene,
complete cds 


1.6 


130997 


REPETITIVE PROLINE-RICH 
CELL WALL PROTEIN 1 
PRECURSOR 

>gi|81809|pir||A29324 proline-
rich protein precursor - soybean
>gi|170049 (J02746) proline-
rich protein [Glycine max]


2.8 


255 


M21488 


Human muscle 
creatine kinase gene 
(CKMM), exon 2.


1.6 


J 19399 


ENV POLYPROTEIN 
PRECURSOR (COAT 
POLYPROTEIN) [CONTAINS: 
COAT PROTEIN GP62; COAT 
PROTEIN GP40] 


2.2 
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















256 


AE001164 


Borrelia burgdorferi 
(section 50 of 70) of 
the complete genome 


1.6 


4050089 


(AF109907) hypothetical 
protein [Homo sapiens]


1.5 


257 


X61757 


M.musculus 
rearranged T-cell 
receptor beta variable 
region (Vb17a)


1.6 


3377766 


(AF080090) semaphorin rv 
isoform b [Mus musculus] 


1.2 


258 


M15346 


T.cruzi tandemly 
repeated gene 
encoding an 85 kDa 
antigen with 
homology to heat 
shock proteins. 


1.6 


2804437 


(AF043695) similar to zinc 
metalloprotease family of
peptidases [Caenorhabditis 
elegans] 


0.41 


259 


L39018 


Rattus norvegicus 
sodium channel 
protein 6 (SCP6) 
mRNA, complete cds 


1.6 


2920535 


(AF018081) type XVIII
collagen [Homo sapiens]


0.037 


260 


M29483 


Human leukocyte 
adhesion protein 
p150,95 alpha subunit
gene, exons 7-15. 


1.6 


1840045 


(U49082) transporter protein 
[Homo sapiens] 


2e-09 


261 


L06844 


Aspergillus niger beta 
D-fructofuranosidase 
(sucl) gene, one 
exon. 


1.6 


4206210 


(AF071527) putative calcium 
channel [Arabidopsis thaliana] 


9e-10 


262 


M10946 


Chicken aldolase B 
gene, complete cds, 
clones lambda-
C(l 1.1.4).


1.6 


2746775 


(AF040640) similar to peptidase 
family C 19 (ubiquitin carboxyl- 
terminal hydrolase) 
[Caenorhabditis elegans] 


1e-31


263 


X07881 


Human gene PRB3L 
for proline-rich 
protein G1


1.5 


<NONE> 


<NONE> 


<NONE> 


264 


U22260 


Nicotiana tabacum 
UMP synthase (pyr5- 
6) mRNA. partial cds 


1.5 


3880923 


(Z99271) similar to Reverse 
transcriptase comes from this 
gene [Caenorhabditis elegans] 


0.50 


265 


U76759 


Mus musculus 
nuclear protein 
NIP45 mRNA. 
complete cds 


1.4 


1330394 


(U58761) C01F1.6 gene product
[Caenorhabditis elegans] 


8.9 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












POTASSIUM-




266 


AF076470 


Rice tungro 
bacilli form virus 
Serdang strain, 
complete senome 


1.4 


1703461 


TRANSPORTING ATPASE
BETA CHAIN (PROTON
PUMP) (GASTRIC H+/K+
ATPASE BETA SUBUNIT)
3.6.1.36) beta chain - human
>gi|184105 (M75110) H,K-
ATPase beta subunit [Homo 
sapiens] 


8.9 


267 


X64659 


C.jacchus interferon 
gene for interferon 
gamma 


1.4 


1486485


(U28832) US10 [Gallid
herpesvirus 1] >gi|1486497 


6.8 


268 


U11825


Schistosoma 
japonicum structural 
muscle protein 
paramyosin mRNA, 
complete cds. 


0.88 


<NONE> 


<NONE> 


<NONE> 


269 


D84278 


Human DNA for 
CD38. exon 1 


0.68 


3766363 


(AL031907) hypothetical serine 
rich protein 

[Schizosaccharomyces pombe] 


3.0 


270 


M59755 


Bovine lens aldose 
reductase 

pseudogene, 3' end.


0.67 


<NONE> 


<NONE> 


<NONE> 


271 


M81758 


Homo sapiens 
skeletal muscle 
voltage-dependent 
sodium channel alpha 
subunit (SkMl) 
mRNA. complete cds. 


0.65 


2437819 


(Z86105) 1,4-beta-glucanase 
[Anaerocellum thermophilum] 


3.6 


272 


L01965


Human type IV 
sodium channel alpha 
polypeptide 


0.64 


2437819 


(Z86105) 1,4-beta-glucanase 
[Anaerocellum thermophilum] 


3.5 


273 


U90122 


Danio rerio bone 
morphogenetic 
protein-4 (bmp4) 
mRNA, partial cds 


0.63 


2983532 


(AE000720) formate 
dehydrogenase alpha subunit 
[Aquifex aeolicus]


7.9 


274 


L41624 


Hylobates lar mucin 
(MUC1) gene, exons 
1-6. 


0.63 


1517808 


(D79215) FGF-10 [Rattus 
norvegicus] 


0.91 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












(UO/y^o) coded for by C.




275 


AF030881 


Fugu rubripes sushi 
retrotransposon gag 
polyprotein (gag) and 
pol polyprotein (pol) 
genes, complete cds 


0.63 


1519696 


elegans cDNA yk126f9.5; coded
for by C. elegans cDNA
yk159h6.3; coded for by C.
elegans cDNA yk126f9.3; coded
for by C. elegans cDNA
yk159h6.5 [Caenorhabditis
elegans] 


0.38 


276 


U52909 


Arabidopsis thaliana 
U1 snRNP 70K
protein gene, 
complete cds 


0.62 


<NONE>


<NONE> 


<NONE> 


277 


AF008192 


Homo sapiens 
putative GR6 protein 
(GR6) mRNA, 
complete cds 


0.62 


3800934 


(AF100655) contains similarity
to ser/thr protein kinases 
[Caenorhabditis elegans] 


9.7 


278 


U17081 


Human fatty acid 
binding protein 
(FABP3) gene, 
complete cds 


0.62 


3617848 


(AF049709) tyrosylprotein 
sulfotransferase-A; TPST-A


7.7 


279 


AB018340 


Homo sapiens mRNA 
for KIAA0797 
protein, partial cds 


0.62 


424044 


VPS protein - porcine rotavirus 
>gi|61355 


7.7 


280 


Y00093 


H.sapiens mRNA for 
leukocyte adhesion 
glycoprotein p150,95


0.62 


1054945 


(U38621) polyprotein [Tobacco
vein mottling virus]


4.5 


281 


M63138 


Human cathepsin D 
(catD) gene, exons 7,
8, and 9. 


0.62 


136810 


GLYCOPROTEIN M 
>gi|73791|pir||WMBE51 UL10 
protein - human herpesvirus 1 1- 
473) [Human herpesvirus 1] 
>gi|221732|gnl|PID|d1002131


3.5 


282 


X76056


N. sylvestris DNA for
spacer region
between 25S and 18S
ribosomal RNA genes


0.62 


2661176 


(U76671) putative cds
[Rhodobacter sphaeroides]


2.0 


283 


X74501


B.taurus mRNA for
ACTH receptor


0.62


4249552


(AB001075) galectin-2 related
protein


2.0 


284 


M57634


Rat F1-ATPase beta
subunit mRNA, 3'
end.


0.62


2119692


transforming growth factor-beta
type III receptor - chicken
>gi|511843 (L01121)
transforming growth factor-beta
type III receptor [Gallus gallus]


1.5 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












ASPARTYL/ASPARAGINYL




285 


Y15724


Homo sapiens 
SERCA3 gene, exons 
1-7 (and Joined CDS) 


0.62 


2498164 


BETA-HYDROXYLASE
(ASPARTATE BETA-
HYDROXYLASE) (ASP BETA-
HYDROXYLASE) (PEPTIDE-
ASPARTATE BETA-
DIOXYGENASE) beta-
dioxygenase (EC 1.14.11.16) -
bovine >gi|162694 taurus]


0.52 


286 


AL010142 


Plasmodium 
falciparum DNA ***
SEQUENCING IN 
PROGRESS *** 
from contig 3-72, 
complete sequence 


0.62 


3183206 


HYPOTHETICAL PROTEIN 
KIAA0009 sapiens] 


4e-07 


287 


AB008160 


Mus musculus Stat3 
gene, 5'-flanking 
region and exon I 
partial sequence 


0.62 


466097 


HYPOTHETICAL 63.5 KD

PROTEIN ZK353.1 IN
CHROMOSOME III 
>gi|1078903|pir||S44654 
ZK353.1 protein - 
Caenorhabditis elegans 
>gi|289757(L15313) putative 
[Caenorhabditis elegans] 


1e-35


288 


AB018795


Halomonas marina 
gene for alginate 
lyase, complete cds 


0.62 


3877493 


(Z.4^8j) similar to ATPases
associated with various cellular
activities (AAA); cDNA EST
EMBL:Z14623 comes from this
gene; cDNA EST
EMBL:D75090 comes from this
gene; cDNA EST 
EMBL:D72255 comes from this 
gene; cDNA EST yk200e4... 


3e-46 


289


Z69906


Human DNA
sequence from
cosmid E141E2, on
chromosome 22,
complete sequence
[Homo sapiens]


0.61 


<NONE> 


<NONE> 


<NONE> 


290 


U13259


Human clone CIITA-
8 MHC class II
transactivator CIITA
mRNA, complete cds.


0.61 


1483567


(X79983) viral proteinase
[Pseudorabies virus]


9.8 


291 


X98890


S. tuberosum mRNA
for inorganic
phosphate
transporter, StPT1


0.61 


475724


(U08884) protein VIII precursor
[Bovine adenovirus type 3]


7.6


292 


U70825 


Rattus norvegicus 5-
oxo-L-prolinase
mRNA, complete cds


0.61 


733543 


(U23448) similar to genome 
polyprotein
(SP:POLG_BVDVN, P19711);
alternative splicing to C04A2.7a 


4.4 


293 


L81667 


Homo sapiens 
(subclone 2_a9 from 
Pi HdQ} DMA 

sequence 


0.61 


2565087 


(U80759) CAGH4 alternate 
open reading frame [Homo 
sapiens] 


3.3 


294 


AE000760


Aquifex aeolicus
section 92 of 109 of
the complete genome


0.61


2811092 


HOMEOBOX PROTEIN HOX- 
A3 (HOX-1.5) homeobox- 
containing transcription factor 
[Mus musculus]


2.6 


295 


U58512 


Mus musculus Rho-
associated, coiled-
coil forming protein
kinase p160 ROCK-1
mRNA, complete cds


0.61 


295671 


(L11275) selected as a weak
suppressor of a mutant of the 
subunit AC40 of DNA 
dependant RNA polymerase I 
and III 


1.5 


296 


U27459 


Human origin
recognition complex
protein 2 homolog
hORC2L mRNA,
complete cds


0.61 


200285 


(M97900) putative open reading 
frame [Mus musculus] 


0.66 


297 


L36680 


Pisum sativum S-
adenosylmethionine
synthase mRNA, 3'
end. 


0.61 


2285790 


(AB002086) p47 [Rattus
norvegicus]


4e-12


298 


AE000673 


Aquifex aeolicus 
the complete genome 


0.61 


3395782 


(AF058446) histone 
macroH2A1.2 [Gallus gallus]


6e-27 


299 


AF086310


Homo sapiens full
length insert cDNA
clone ZD51F08


0.61 


3646450 


(AL031603) conserved 
hypothetical protein. 
[Schizosaccharomyces pombe]


8e-29 


300 


AJ009675


Agrotis ipsilon
mRNA for 3-hydroxy-
3-methylglutaryl
coenzyme A
reductase


0.61 


4176370


(AC005058) similar to calcium-
independent phospholipase A2;
similar to AC004392
(PID:g3367519) [Homo
sapiens]


2e-73 


301 


AC005577


Homo sapiens
chromosome 19,
cosmid F18382B,
centromeric end,
complete sequence
[Homo sapiens]


0.60


<NONE>


<NONE>


<NONE>


302


Candida albicans
topoisomerase type I
(CATOP1) gene,
complete cds


0.60


<NONE>


<NONE> 


<NONE> 


303 


J01390


Emericella nidulans 
mtDNA between 
h2/h5 and bh2/b2 
junctions, genes for 
ATPase subunit 6, 
cytochrome oxidase 
subunit 3, seven, 
unidentified proteins, 
twentyfour tRNA's 
and L-rRNA. 


0.60


<NONE>


<NONE> 


<NONE> 


304 


r 1 1 no 


Plasmodium 
falciparum RNA 
polymerase I gene, 
complete cds. 


0.60


<NONE>


<NONE> 


<NONE> 


305 


Z81079


Caenorhabditis
elegans cosmid
F39H11, complete 
sequence 
[Caenorhabditis 
elegans] 


0.60


<NONE>


<NONE> 


<NONE> 


306 


Z49627


S.cerevisiae 
chromosome X 
reading frame ORF 
YJR127c 


0.60 1 1 1 8751 


MAJOR DNA-BINDING
PROTEIN herpesvirus I (strain
11) >gi|60327 (X64346) major
ssDNA-binding protein
[Saimiriine herpesvirus 2]


9.6 


307 


U949H 


Rattus norvegicus H- 

i rase aipna z 
gene, alternatively 
spliced products and 
partial cds 


0.60


2213862


(AF003086) PfSNF2L
[Plasmodium falciparum]


7.4 


308 


U67476


Methanococcus
jannaschii section 18
of 150 of the
complete genome


0.60


1749688


(D89240) unnamed protein
product


5.7 


309 


U67513


Methanococcus
jannaschii section 55
of 150 of the
complete genome


0.60


3327421


(U97068) zonadhesin [Mus
musculus]


4.3 


310 


U57817


Haemophilus ducreyi
lipoprotein gene,
complete cds


0.60


4008577


(AL034491) conserved
hypothetical protein
[Schizosaccharomyces pombe]


2.5


311 


X80700 


H.sapiens GI7 gene 


0.60


422541 


probable protein-tyrosine kinase
(EC 2.7.1.112) RTK - Pacific 
electric ray >gi|290858 


1.5 


312 


L42167 


Mus musculus (clone
R24) rds gene, partial
cds


0.60


4220848 


(AF033823) moira [Drosophila 
melanogaster] 


0.51 


313 


U54777 


Human hMSJ-ffi 
mRNA, complete cds


0.60


2665637 


(AF031087) mismatch repair 
protein MSH6 [Mus musculus]


5e-07 


314 


D86985 


Human mRNA for 
KIAA0232 gene,
complete cds


0.60


1938462 


(U97006) No definition line 
found [Caenorhabditis elegans]


2e-07 


315 


D43964 


Rat liver mRNA for
Kan-1, complete cds


0.60


1280135 


(U^J76) coded for by C. 1 
elegans cDNA cm21e6: coded 
for by C. elegans cDNA 
cm01e2; similar to melibiose 
carrier protein 

(thiomethylgalactoside permease
II)


5e-15 


316 


U49058 


Rattus norvegicus 
CTD-binding SR-like 
protein rA4 mRNA,


0.60


2145091 


(U37500) RNA polymerase II
largest subunit [Mus musculus]


1e-19


317 


X84388 


U.ruddi 

mitochondrial 12S 
ribosomal RNA 


0.60


3874247 


(Z70205) predicted using 
Genefinder 


2e-37 


318 


AF125447


Caenorhabditis
elegans cosmid
Y14H12B


0.59 


<NONE> 


<NONE> 


<NONE> 


319 


U20189


Hyoscyamus muticus
clone cVS2
vetispiradiene
synthase mRNA,
partial cds. 


0.59 


<NONE> 


<NONE> 


<NONE> 


320 


M63962


Human gastric H,K-
ATPase catalytic
subunit gene,
complete cds.


0.59 


<NONE> 


<NONE> 


<NONE> 


321 


AJ132366


Helicobacter pylori
(strain P1) comB and
pmi/algA (partial)
genes, and partial
ORF1 and ORF2


0.59


<NONE>


<NONE>


<NONE>


322


U17289


Mus musculus
transcription factor
AP-2 (AP-2) gene,
alternative exon 1a,
and isoform 2, partial
cds.


0.59 


2459419 


(AC002332) hypothetical 
protein [Arabidopsis thaliana]


9.4 


323 


Z71466 


S.cerevisiae 
chromosome XIV 
reading frame ORF 
YNL190w


0.59 


3875542 


(Z67990) Similarity to Rat 
amiloride-sensitive sodium
channel beta-subunit 


7.3 


324 


Z66493 


Beet soil-borne virus 
genes ror UJ\, Z£)\ 
and 48K proteins 


0.59 


2119867 


cryV465 protein - Bacillus 
thuringiensis thuringiensis] 


7.2


325 


L41351


Homo sapiens 
prostasin mRNA, 
complete cds 


0.59 


729212 


CRYSTALLIN J1C crystallin 
[Tripedalia cystophora] 


4.2 


326 


X79854 


S.lincolnensis gene
for 16S ribosomal 
RNA 


0.59 


3702828 


(AF056577) high mobility 
group protein 1.2 


3.2 


327 


AJ223356 


Strongylocentrotus 
purpuratus mRNA for 
SuDp98 protein 


0.59 


2495704 


HYPOTHETICAL PROTEIN 
KIAA0129 product is novel. 
[Homo sapiens] 


2.5 


328 


X86019 


H.sapiens mRNA for 
PRPL-2 protein 


0.59 


1743341 


(Y10027) transcription factor
TEF-1 [Mus musculus]


2.5 


329 


U75528 


Xiphias gladius 
creatine kinase gene, 
partial cds 


0.59 


1845995 


(U69477) envelope glycoprotein
[Human immunodeficiency virus
type 1]


2.4 


330 


AC005573


Homo sapiens
chromosome 5, PAC
clone 202e13


0.59 


2506366


DNA POLYMERASE
EPSILON SUBUNIT B DNA-
directed DNA polymerase (EC
2.7.7.7) II chain B - yeast
(Saccharomyces cerevisiae)
>gi|786319 (U25842) DNA
Polymerase epsilon, subunit B
(Swiss Prot. accession number
P24482) [Saccharomyces
cerevisiae]


1.4 


331 


L19180


Rat receptor-linked
protein tyrosine
phosphatase


0.59 


1235974


(X96713) collagen [Globodera
pallida]


1.1 


332 


L32090


Listeria
monocytogenes secA
gene, complete cds.


0.59 


2291129


(AF016415) No definition line
found [Caenorhabditis elegans]


0.83


333


U24433


Xenopus laevis
syndecan-2 mRNA,
complete cds.


0.59


3355692


(AL031124) hypothetical
protein SC1C2.25c
[Streptomyces coelicolor]


0.64 


334 


M23412 


Drosophila 
muscarinic
acetylcholine receptor
mRNA, complete cds


0.59 


168237 


(M76546) hydroxyproline-rich
protein [Helianthus annuus] 


0.22 


335 


AF060729 


Synaphea media 
chloroplast atpB-rbcL 
intergenic spacer 
region, partial 
sequence 


0.59 


731596 


HYPOTHETICAL 67.5 KD
PROTEIN IN PRPS4-STE20
INTERGENIC REGION
>gi|626567|pir||S46825
hypothetical protein YHL010c -
yeast (Saccharomyces
cerevisiae) >gi|2289881
(U11582) No definition line
found [Saccharomyces
cerevisiae]


0.16 


336 


AF029734 


Xanthobacter 
autotrophicus
transcriptional 
activator AldR (aldR) 
gene, partial cds; and 
NAD-dependent 
chloroacetaldehyde 
dehydrogenase (aldB) 
gene, complete cds 


0.59 


2498801 


PERIAXIN 

>gi|2143901|pir||158I57 periaxin 
- rat >gi|505297 (Z29649) 
periaxin [Rattus norvegicus] 


0.13 


337 


X95307 


C.reinhardtii LI818r-
1 gene


0.59 


1723781 


H V PU I Hh 1 1UAL MTSTD 
PROTEIN IN TAF145-YOR1
INTERGENIC REGION
>gi|2131717|pir||S64612
hypothetical protein YGR277c -
yeast (Saccharomyces
cerevisiae)
>gi|1323505|gnl|PID|e243248
(Z73062) ORF YGR277c
[Saccharomyces cerevisiae]


1e-04


338 


M24572


Dictyostelium
discoideum tRNA-
Glu-GAA gene, clone
fGluGAAS. 


0.59 


1176186


HYPOTHETICAL 43.3 KD
GTP-BINDING PROTEIN IN
DACB-RPMA INTERGENIC
REGION >gi|606121 [Escherichia coli]


3e-06 


339 


U73733


Human hMSH6 gene,
exon 2


0.59 


2665637


(AF031087) mismatch repair
protein MSH6 [Mus musculus]


5e-07



340



D90747



Escherichia coli
genomic DNA. (25.2 -
25.6 min)



0.59 



134286 



DOLICHOL KINASE 



6e-08 



341 



342 



J05211



Human desmoplakin 
mRNA, 3' end. 



L24441 



Loligo pealii kinesin 
light chain mRNA, 
complete cds. 



0.59 



246796 



major centromere protein, 
CENP-B [human, Peptide, 594
aa]



0.59 



547800 



KINESIN LIGHT CHAIN 
(KLC) sea urchin 
(Strongylocentrotus purpuratus) 
>gi|161530 



4e-08 



5e-14



343 | M25140



Human cardiac alpha
myosin heavy chain 
(MYH6) gene, exons 
2, 3 and 4. 



0.58 



344 | L81932



Homo sapiens 
(subclone 9 Ji2 from 
PI H21) DNA 
sequence 



<NONE> 



<NONE> 



0.58 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



345 | AF087966



Homo sapiens full 
length insert cDNA 
clone YU51G04 



0.58 



<NONE> 



<NONE> 



<NONE> 



346 | Z78574



H.sapiens flow-sorted
chromosome 6 TaqI 
fragment, 
SC6pA10Gll 



0.58 



<NONE> 



<NONE> 



<NONE> 



347 | AF068061



Blattella germanica 
allatostatin 
neuropeptide 
precursor, gene, 
complete cds 



0.58 



<NONE> 



<NONE> 



<NONE> 



348 | AF015592



349 | AF028006 



350 | AB017032



Homo sapiens Cdc7 
(CDC7) mRNA, 
complete cds 



0.58 



<NONE> 



<NONE> 



Methanosarcina 
barkeri atp operon: 
ATP synthase beta 
subunit (atpD), ATP 
synthase epsilon 
subunit (atpC), ATP 
synthase gene 1 
(atpl), ATP synthase 
a subunit subunit (.., 



0.58 



3184291 



Mus musculus gene 
for pancreatic trypsin, 
complete cds 



0.58 



3170561 



(AC004136) putative DNA 
polymerase III gamma subunit 



(AF056704) synapsin IIIa
[Rattus norvegicus] 



<NONE> 



9.4 



9.2


351


AF081535


Dictyostelium
discoideum
developmental
protein DG1110
(DG1110) gene,
partial cds


0.58 


105417 


basic proline-rich peptide IB-8a
human 


9.2 


352 


AF086322 


Homo sapiens full 
length insert cDNA 
clone ZD53E01


0.58 


93026 


hypothetical protein - African 
swine fever virus (strain Malawi 
Lil-20/1) >gi|450758 (X71982) 
myeloid differentiation antigen 
homologue [African swine fever 
virus] >gi|903686 (M95672) 
unknown protein 


7.1 


353 


AF088025 


Homo sapiens full 
length insert cDNA 
clone ZC19C04 


0.58 


2384644 


(U92805) thrombospondin-3 
[Xenopus laevis] 


7.0 


354 


AB002339


Human mRNA for 
KIAA0341 gene, 
partial cds 


0.58 


2135587 


M130 antigen (cytosolic variant
2) - human 


5.4 


355 


U67548 


Methanococcus 
jannaschii section 90 
of 150 of the 
complete genome 


0.58 


2911094 


(AL021957) hypothetical 
protein Rv2174 


4.2 


356 


L07868 


Homo sapiens 
receptor tyrosine 
kinase (ERBB4) 
gene, complete cds. 


0.58


461922


PYRUVATE
DECARBOXYLASE (8-10 NM
CYTOPLASMIC FILAMENT-
ASSOCIATED PROTEIN)
(P59NC) 4.1.1.1) - Neurospora
crassa >gi|293948 (L09125)
pyruvate decarboxylase
[Neurospora crassa]
>gi|1655909


4.2


357 


X03897 


Bacillus subtilis 
sigma 43 operon with 
P23-dnaE-rpoD genes 
(dnaE for DNA 
primase, rpoD for 
RNA polymerase) 


0.58 


1323704 


(U55387) similar to C. elegans 
F38E1.9 gene product encoded 
by GenBank Accession Number 
U41996 [Cricetulus griseus] 


4.1 


358 


D76419 


Desulfovibrio
vulgaris rbo gene for 
desulfoferrodoxin and 
rub gene for 
rubredoxin, complete 
cds 


0.58 


3420047 


(AC004680) putative protein
kinase [Arabidopsis thaliana] 


2.4


359


Z82174


Human DNA
sequence from
cosmid B20F6 on
chromosome 22,
complete sequence
[Homo sapiens]


0.58 


2145455 


(Y07866) catalase-peroxidase


2.4 


360 


M33642 


F.solani STI35
protein gene, 
complete cds. 


0.58 


2896706 


(AL021897) hypothetical 
protein Rv1069c


2.4 


361 


U64873 


Mus musculus 
transforming growth 
factor alpha (TGF 
alpha) gene, partial 
cds 


0.58 


3874437 


(Z81038) predicted using 
Genefinder; cDNA EST 
yk488a2.5 comes from this gene 
[Caenorhabditis elegans]


1.8 


362 


AB002132 


Macrophthalmus 
banzai mitochondrial 
DNA for 12S and 
16S rRNA, partial 
and complete 
sequence 


0.58 


2960022 


(AJ224676) rho type GEF 
[Drosophila melanogaster] 


1.8


363 


AF070070 


Caenorhabditis 
elegans MutS 
homolog (msh-5) 
mRNA, partial cds


0.58 


4098205 


(U75869) Omp22 [Helicobacter 
pylori] 


1.8 


364 


AF045240 


Staphylococcus 
epidermidis plasmid 
pIP1629 mobilization 
protein (mobCl), 
(orf69-lMmobAl), 


0.58 


4218117 


(AL035353) protein (fragment) 


0.62 


365 


X61637 


H.sapiens Wilms 
tumor gene 1, exons 8
and 9 


0.58 


2331059 


(U88211) unknown [Gallus
gallus] 


0.62 


366 


AF039312 


Moraxella catarrhalis 
strain 4223 transferrin 
binding protein A 
(tbpA) and transferrin 
binding protein B 
(tbpB) genes, 
complete cds; and 
unknown gene


0.58 


120155 


FIBER PROTEIN 
>gi|74229|pir||ERADFM fiber 
protein - mouse adenovirus 1 
>gi|209758 (M30594) fiber 
protein [Mastadenovirus musl] 


0.27 


367 


D87463 


Human mRNA for 
KIAA0273 gene, 
complete cds 


0.58 


3861477 


(U94177) androgen receptor
[Pan troglodytes] 


0.12 


368 


U40342 


Mus musculus ninein 
mRNA, complete cds


0.58 


4115936 


(AF118223) No definition line
found [Arabidopsis thaliana] 


0.004


369 


S57235


CD68=110kda
transmembrane
glycoprotein [human,
promonocyte cell line
U937, mRNA, 1722
nt]


0.58 


2072501 


(U96113) WWP1 [Homo
sapiens]


1e-04


370 


U39391 


Mus musculus 
serotonin 1A receptor
mRNA, complete cds. 


0.58 


1469876 


(D63481) The KIAA0147 gene 
product is related to adenylyl 
cyclase. [Homo sapiens]


1e-07


371 


D00056


Monkey B-
lymphotropic
papovavirus genes for 
VP-1,2, 3 and large 
T antigen, complete 
and partial cds, strain 
LPV-76>:: 
gb|M14494|PPMVPl 
M Monkey B- 
lymphotropic 
papovavirus mutant 
(LPV-76) PstI B 
fragment encoding 
VP1, VP2, VP3 and
T-antigen.


0.58 


2462069 


(AJ001774) vanadium 
chloroperoxidase 


1e-08


372


M77182


Amsacta
entomopoxvirus
spheroidin gene,
complete cds, and
four vaccinia related
orfs. > ::
gb|I16670|I16670
Sequence 1 from
patent US 5476781


0.58 


1730722 


H Y HI f 1 i-ll< 1 K Al d ^ >i k 1 1 "" ' 

PROTEIN IN NCE3-HHT2
INTERGENIC REGION
>gi|2131871|pir||S62957
hypothetical protein YNL035c -
yeast (Saccharomyces
cerevisiae)
>gi|1301880|gnl|PID|e239670
(Z71311) ORF YNL035c
[Saccharomyces cerevisiae]


8e-14


373 


S72579 


g loo-S= growth - 
associated protein 
GAP-43 homolog 


0.58 


2689720 


(AF037168) DnaJ homologue 
[Arabidopsis thaliana]


7e-14 


374 


AF018165


Tetraodon fluviatilis
amyloid precursor
protein mRNA,
complete cds


0.58 


3219938 


HYPOTHETICAL 34.9 KD 
PROTEIN C57A10.11C IN 
CHROMOSOME I
>gi|2058378|gnl|PID|e314002
[Schizosaccharomyces pombe]


5e-22


375 


U81803


Filobasidiella
neoformans
translation elongation
factor EF1-alpha
(CnTEf1) mRNA,
complete cds 


0.57 


<NONE> 


<NONE> 


<NONE> 


376 


U09781


Candida albicans 
ATCC 18804, CBS 
562 peptide 
transporter gene, 
complete cds. 


0.57 


<NONE>


<NONE> 


<NONE> 


377 


AC002143


Homo sapiens
(subclone 4_b10 from
BAC H102) DNA 
sequence 


0.57 


<NONE> 


<NONE> 


<NONE> 


378 


U23442 


Tetrahymena 
thermophilaRR 
internal deletion 
sequence. 


0.57 


<NONE> 


<NONE> 


<NONE> 


379 


U17289


Mus musculus
transcription factor
AP-2 (AP-2) gene,
alternative exon 1a,
and isoform 2, partial 
cds. 


0.57 


<NONE> 


<NONE> 


<NONE> 


380 


X70844 


Buzura suppressaria 
nuclear polyhedrosis 
virus gene for 
polyhedrin protein 


0.57 


<NONE> 


<NONE> 


<NONE> 


381 


AJ012159


Homo sapiens 5T4
oncofetal trophoblast
glycoprotein gene


0.57 


<NONE> 


<NONE> 


<NONE> 


382 


X76571


H.sapiens simple
DNA sequence region
clone wg1a8.


0.57


<NONE>


<NONE>


<NONE>



AF034434



Vibrio cholerae
pathogenicity island 
putative transposase 
aldehyde 
dehydrogenase 
(aldA), toxR- 
activated gene A 
protein (tagA), 
putative inner 
membrane protein, 
and putative zinc 
metalloprotease
genes, complete cds; 
and... 



0.57 



<NONE> 



<NONE> 



<NONE> 



AB017031



Mus musculus gene
for TESP4, complete 
cds 



0.57 



<NONE> 



<NONE> 



<NONE> 



X89788 



S.hispidus 
mitochondrial DNA 
for SSU ribosomal
RNA gene



0.57 



L16921 



Rat progesteron 
receptor gene, 5'
untranslated region.



<NONE> 



<NONE> 



<NONE> 



0.57 



3323116 



(AE001251) femA protein, 
putative [Treponema pallidum]



8.9 



AF027292 



Homo sapiens 
interferon regulatory 
factor 6 



0.57 



259790 



(S48157) DNA polymerase- 
primase 180 kda subunit 
[Drosophila melanogaster,
Peptide, 1490 aa] 



6.7 



AJ012581



Cicer arietinum
mRNA for
cytochrome P450



0.57 



2131498 



hypothetical protein YDR446w 
yeast CAI: 0.11 [Saccharomyces
cerevisiae] 



5.3 



L15363 



AE000525



AF020189 



Human transfer RNA 
Met (TRMEP1) 
pseudogene, complete 
gene 



0.57 



Helicobacter pylori 
26695 section 3 of 
134 of the complete 
genome



3228680 



0.57 



1938478



Amblyomma 
americanum 
ecdysteroid receptor 
(AamEcR) mRNA,
3'UTR, region 1



0.57 



2072224 



(AF070935) GABA receptor 
subunit [Musca domestica] 



5.2 



(U97008) weak similarity to 
family I of G-protein coupled 
receptors [Caenorhabditis 
elegans]



4.0 



(U94875) p40 [Borna disease 
virus]



4.0 



392


X56997


Human UbA52 gene
coding for ubiquitin-
52 amino acid fusion
protein


0.57 


29601 n 


(AL022121) hypothetical

nmrcin Du^ASQ 
piULCin I\VJQo7 


a n 


393 


AL010260 


Plasmodium 
falciparum DNA *** 
SEQUENCING IN 
PROGRESS *** 
from contig 4-81, 
complete sequence 


0.57 


117233 


CYTOCHROME P450 2C14
(CYPIIC14) phenobarbital-
inducible, hepatic - rabbit P-450
[Oryctolagus cuniculus]
>gi|358265|prf||1306317A
cytochrome P450 [Oryctolagus
cuniculus]


3.9


394 


M99581 


Xenopus laevis 
gamma-crystallin
(gcry3) gene, 
complete cds. 


0.57 


141647 


GASTRULA ZINC FINGER 
PROTEIN XLCGF44 
>gi|85736|pir||S06571 finger 
protein (clone XlcGF44-2) - 
African clawed frog (fragment) 


3.0 


395 


M38384 


Drosophila 
melanogaster seven in 
absentia mRNA, 
complete cds. 


0.57 


1707127 


(U80454) T16A1.1
[Caenorhabditis elegans]


3.0 


396 


U32795 


Haemophilus 
influenzae Rd section 
110 of 163 of the 
complete genome 


0.57 


1173433 


IRON(III)-TRANSPORT
SYSTEM PERMEASE
PROTEIN SFUB >gi|152861
(M33815) protein (sufB) 


2.3 


397 


X12600 


Klebsiella
pneumoniae nifX, 
nifU, nifS, nifV and 
nifW genes 


0.57 


2909562 


(AL021925) hypothetical 
protein Rv2256c 


1.4


398 


AB014526 


Homo sapiens mRNA 
for KIAA0626
protein, complete cds


0.57


482390 


insect-stage-specific protein - 
Trypanosoma cruzi >gi|162099
(M65021) insect stage-specific
antigen


0.61


399 


AF063587 


Rhodococcus fascians
strain NRRL-B- 
15096 hypothetical 
protein gene, 
complete cds 


0.57 


4104321 


(AF034582) vesicle associated
protein [Rattus norvegicus] 


0.46 


400 


L11117


Guinea pig estrone
sulfotransferase gene.


0.57 


82584 


alpha/beta-gliadin precursor
(clone A212) - wheat


0.35


401 


V00829 


Mouse complete gene 
for a mouse kallikrein 
gene. Genes are mGK 
I (complete gene) 
and mGK-2 of 
hormones, e.g., 
grow... > :: 
gb|J00390|MUSKAL 
07 Mouse pseudo-
kallikrein 2, exons 4
and 5, and kallikrein 
1 gene, complete cds. 


0.57 


2500916 


NUCLEAR HORMONE 
RECEPTOR NOR-2 receptor 
[Rattus norvegicus]
>gi|1583604|prf||2121281A
NOR-2 protein [Rattus 
norvegicus] 


0.20 


402 


X53092 


Chicken mRNA for 
beta-2 subunit of 
neuronal nicotinic 
acetylcholine receptor 


0.57 


1072256 


(U40953) similar to matrin F/G 
\or .v^uuviu/ containing i^^f- 
type zinc- fingers 
[Caenorhabditis elegans]


0.031 


403 


L07939 


Ovis ovis granulocyte 
colony stimulating 
factor 


0.57 


3874345 


IVD) predicted using 
Genefinder; Similarity to 
dehydrogenases; cDNA EST 
EMBL:D65800 comes from this 
gene; cDNA EST 
EMBL:D76184 comes from this 
gene; cDNA EST 
EMBL:D69322 comes from this 
gene; cDNA EST 
EMBL:C08158 comes f... 


3e-07 


404 


U18061 


Colletotrichum 
gloeosporioides 
CAP20 (cap20) gene, 
complete cds. 


0.57 


2914695 


(AC003974) putative ubiquitin 
specific protease 


9e-08 


405 


Z73955


L.japonicus mRNA
for small GTP-
binding protein,
RAB11G


0.57 


112894


TUMOR NECROSIS FACTOR,
ALPHA-INDUCED PROTEIN
3 (PUTATIVE DNA BINDING
PROTEIN A20) (ZINC
FINGER PROTEIN A20)
>gi|107549|pir||A35797
probable DNA-binding protein
A20 - human >gi|177866
(M59465) A20


7e-08



406 


X04335 


Petunia grp-1 gene 
for glycine-rich 
protein 


0.57 


3876901 


{Zl/bbU) Similarity to Human 
enoyl-CoA hydratase 
(SW:ECHM_HUMAN); cDNA
EST EMBL:T006H comes 
from this gene; cDNA EST 
yk203dl0.3 comes from this 
gene; cDNA EST yk203dl0.5 
comes from this gene; cDNA 
EST yk457h5.3 comes from t... 


1e-27


407 


U40718 


Rattus norvegicus S-
adenosylmethionine
decarboxylase
(AMDP2)
pseudogene


0.56 


<NONE> 


<NONE> 


<NONE> 


408 


M60318 


S.cerevisiae SSD1
protein gene,
complete cds. > ::
gb|AR013983|AR013
983 Sequence 8 from
patent US 5773245 


0.56 


<NONE> 


<NONE> 


<NONE> 


409 


X60057 


Nicotiana tabacum 
blp4 mRNA for 
luminal binding 
protein (BiP)


0.56 


<NONE> 


<NONE> 


<NONE> 


410 


AF035930 


Homo sapiens full 
length insert cDNA 
clone YR55A09 


0.56 


<NONE> 


<NONE> 


<NONE> 


411 


AL010189


Plasmodium 
falciparum DNA *** 
SEQUENCING IN 
PROGRESS *** 
from contig 3-102, 
complete sequence 


0.56


<NONE>


<NONE>


<NONE> 


412 


X05402 


Murine G-CSF gene 
for granulocyte 
colony stimulating 
factor precursor 


0.56 


<NONE> 


<NONE> 


<NONE> 


413 


U92280 


Rattus norvegicus 
regulator of G-protein 
signalling 12 
(RGS12) mRNA, 
complete cds 


0.56 


<NONE> 


<NONE> 


<NONE> 


414 


U85660


Human
papillomavirus strain
RTRX7 complete
genome


0.56 


<NONE> 


<NONE> 


<NONE> 



415 


X57626 


M. javanica 
mitochondrion 
ATPase 6, and 
putative tRNA-f-Met 
and tRNA-His genes 


0.56 


<NONE> 


<NONE> 


<NONE> 


416 


AB003363 


Sus scrofa S100C
gene, complete cds 


0.56 


<NONE> 


<NONE> 


<NONE> 


417 


L42291 


Danio rerio DANA 
element, intron 4. 


0.56 


2650002 


(AE001062) conserved 
hypothetical protein 
[Archaeoglobus fulgidus]


8.7 


418 


AF031826 


Mus musculus 
leukocystatin gene,
complete cds 


0.56 


462493 


L-LACTATE
DEHYDROGENASE 
(IMMUNOGENIC PROTEIN 
P36) >gi|479296|pir||S33362 L- 
lactate dehydrogenase (EC 
1.1.1.27) - Mycoplasma 
hyopneumoniae 


6.7 


419 


U 17068 


Pennisetum glaucum 
Ac-like element, 
AcL2. 


0.56 


399449 


ESCARGOT/SNAIL PROTEIN
HOMOLOG 


6.7 


420 


Z48042 


H.sapiens mRNA 
encoding GPI- 
anchored protein 
pL37 


0.56 


141232 


HYPOTHETICAL 8.7 KD 
PROTEIN (READING FRAME 
D) >gi|76316|pir||QQSA7C
hypothetical protein E-74


6.7 


421 


AF027657 


Choristoneura
fumiferana 
entomopoxvirus 
nucleotide 
triphosphate 
phosphohydrolase I 
(NPHI) gene, 
complete cds 


0.56 


464999 


PUTATIVE
ACETYLCHOLINE
REGULATOR UNC-18
>gi|480359|pir||S36747
acetylcholine regulator unc-18 -
Caenorhabditis elegans
>gi|247392|bbs|100294 putative
acetylcholine regulator unc-18


5.1 


422 


AB011540 


Homo sapiens mRNA 
for MEGF7, partial 
cds 


0.56 


1718033 


URACIL-DNA 
GLYCOSYLASE (UDG) 
herpesvirus 2 >gi|695219
(U20824) uracil DNA
glycosylase


5.1 


423 


X59941 


X.maculatus NGF 
gene for nerve growth 
factor 


0.56 


1169081 


COMMON PLANT 
REGULATORY FACTOR 
CPRF-1 >gi|515621 (X58575)
light-inducible protein CPRF-1
[Petroselinum crispum]
>gi|1498301 (U46217) CPRF1


3.8


424 


M72711 


repressor of myelin-
specific genes (SCIP)
mRNA, complete cds


0.56


501027


(U01849) ORF2 [Trypanosoma
brucei] 


2.3 


425 


AL023850 


Caenorhabditis 
elegans cosmid 
Y67D1IA, complete 
sequence 
[Caenorhabditis 
elegans] 


0.56


266771


CHORISMATE MUTASE
(CM) / PREPHENATE
DEHYDRATASE (PDT) (P-
PROTEIN)
>gi|281791|pir||S26053
chorismate mutase (EC 5.4.99.5)
P / prephenate dehydratase (EC
4.2.1.51) - Erwinia herbicola
>gi|43344


2.3


426 


U47862 


Schistosoma mansoni 
gynecophoral canal 
protein mRNA,
complete cds




1 14713o 


ATP synthase chain 6 - 
Platymonas subcordiformis 
mitochondrion >gi|633582 
(Z47797) ATP synthase subunit 
6 [Platymonas subcordiformis] 


2.3 


427 


V00574 


Human germ line 
gene homologous to 
bladder carcinoma 
oncogene T24 (Gene 

CftHp (*- rll-MC— t \ with 

four exons. 


0.56 


1518672 


(U60289) receptor protein 
tyrosine phosphatase psi (Homo 
sapiens] 


1.7 


428 


Z71502 


gene 


0.56 


1651674 


(D90899) ferrichrome-iron
receptor 


1.3 


429 


M37278 


R.norvegicus renin 


0.56


2853019 


(AF045141) putative serine 
proteinase [Scirpophaga 
incertulas] 


1.0 


430 


D28878


Thermus
thermophilus polA
gene for thermostable
DNA polymerase I,
complete cds


0.56


3659692


(AF068748) sphingosine kinase
[Mus musculus]


0.77 


431 


Z15027


H.sapiens HLA class
II DNA


0.56


1304141


(D43758) fibrinogen A-alpha-
chain


0.76 


432 


M14362


Human T-cell surface
antigen CD2 (T11)
mRNA, complete cds.


0.56


2462979


(Y11915) Tenascin-X [Bos
taurus]


0.59 


433 


Z50801


Z.mays mRNA for
chlorophyll a/b-
binding protein CP29


0.56


109677


collagen alpha 1(I) chain -
mouse >gi|50487


0.50
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












HYPOTHETICAL 86.0 KD




434 


Z38114


S.cerevisiae
chromosome XIII
cosmid 9745


0.56 


140372 


PROTEIN IN GLK1-SRO9
INTERGENIC REGION
>gi|83159|pir||S19367
hypothetical protein YCL039w - 
yeast (Saccharomyces 
cerevisiae) 


0.35 


435 


AF052254 


Escherichia coli DNA 
gyrase A (gyrA) gene,
partial cds


0.56 


2724126 


(AF038535) synaptotagmin VII
[Homo sapiens] 


0.12 


436 


AF080649 


Tegula pulligo 12S
small subunit
ribosomal RNA gene,
mitochondrial gene
for mitochondrial
RNA, partial
sequence


0.56 


3913223 


CYCLIN-DEPENDENT 
KINASE INHIBITOR 1
p21/WAF1 [Felis catus]


0.11


437 


AJ005690 


Danio rerio mRNA 
for protein tyrosine
kinase 


0.56 


2623830 


(AF030962) unknown 
[Schistosoma mansoni] 


7e-06 


438 


U31202 


Human noggin 
(NOGGIN) gene, 

complete cds.


0.56


Jo/04/5 


(Z78411) F02D8.3
[Caenorhabditis elegans]


1e-06


439 


X51695


Ovis sp. trichohyalin 
mRNA, partial 


0.56 


3386622 


(AC004665) unknown protein
[Arabidopsis thaliana]


1e-10


440 


U28938 


Rattus norvegicus 
protein tyrosine 
phosphatase D30 
mRNA, complete cds




3293547 


(AF072709) putative 
oxidoreductase [Streptomyces 
lividans] 


1e-14


441 


AE001171


Borrelia burgdorferi
(section 57 of 70) of
the complete genome


0.56 


2315521 


(AF016452) similar to the beta
transducin family




442 


< 

AF036685


Caenorhabditis
elegans cosmid
:05BI0 


0.56 


i 
{ 

1519671


(U67951) contains similarity to
ATP/GTP-binding site motif A
(PS:PS00017) [Caenorhabditis
elegans]


6e-20 


443 


> 

X01173


Xenopus laevis
vitellogenin gene A1
5' flanking region


0.56 


( 
F 

1118102


(U41558) K02B2.3 gene
product [Caenorhabditis
elegans]


2e-31 


444 


I 
f 

D109U c 


Mus musculus DNA
for MS2 protein,
complete cds


0.55


<NONE>


<NONE>


<NONE>
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


445 


D30010 


Rice mRNA EN 117, 
partial sequence 


0.55


<NONE>


<NONE> 


<NONE> 


446 


U51991 


Escherichia coli 
phosphoprotein 
phosphatase 


0.55


<NONE>


<NONE> 


<NONE> 


447 


M18858


Mouse T cell receptor
C-gamma-7.1 mRNA
3' end.

0.55 


<NONE> 


<NONE> 


<NONE> 


448 


U95218 


Homo sapiens T cell- 
death associated 
protein gene, 
complete cds 


0.55


<NONE>


<NONE> 


<NONE> 


449 


M14948 


Human R-ras gene, 
exon 1. 


0.55 


<NONE> 


<NONE> 


<NONE> 


450 


AB002353 


Human mRNA for 
KIAA0355 gene, 
complete cds 


0.55 


<NONE> 


<NONE> 


<NONE> 


451 


L81689 


Homo sapiens 
(subclone l_d6 from 
PI H54) DNA 
sequence 


0.55 


<NONE> 


<NONE> 


<NONE> 


452 


M68955 


Human myristoylated 
alanine-rich C-kinase 
substrate (MACS) 
gene, 5' end. 


0.55 


3322710 


(AE001220) V-type ATPase,
subunit B (atpB-1) [Treponema 
pallidum] 


5.0 


453 


X62953 


R.norvegicus mRNA 
(pJG116) with
repetitive elements 


0.55 


1076802 


extensin-like protein - maize 
>gi|600118 mays] 


5.0 


454 


( 
< 
i 
( 

L34630


Synechocystis sp.
mntABC transporter
system: periplasmic-
binding protein
(mntC), complete cds;
(mntA) gene,
complete cds;
membrane protein
(mntB) gene,
complete cds.


0.55 


( 

2117632 


hydrogen dehydrogenase (EC
1.12.1.2) - Clostridium
acetobutylicum >gi|557064
(U15277) hydrogenase I
[Clostridium acetobutylicum]


5.0 


455 


I 

r 
F 

U43521


Plasmodium berghei
merozoite surface
protein-1 gene,
complete cds


0.55 


127654


MYOGLOBIN 


4.9
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















456 


Z64937 


H.sapiens CpG DNA, 
clone 17g7, reverse 
read cpg17g7.rt1a.


0.55 


417298 


MFS 18 PROTEIN 
PRECURSOR 


3.8 


457 


U10914 


Macaca mulatta clone 
irh83 T-cell receptor 
alpha chain mRNA, 
partial cds. 


0.55 


310406 


(L09212) tat protein [Simian
immunodeficiency virus]


3.8 


458 


AF022838 


Homo sapiens 
multidrug resistance 
protein 


0.55 


1585251 


traB gene [Amycolatopsis
methanolica] 


2.8 


459 


M35603 


Mouse Hox-3.1 gene
and Hox-3.2-Hox-3.1 
intergenic region. 


0.55 


818849 


(U25430) nucleotide 
pyrophosphatase precursor 
[Oryza sativa] 


2.0 


460 


AE001395


Plasmodium 
falciparum 
chromosome 2, 
section 32 of 73 of 
the complete 
sequence 


0.55 


137532 


PROTEIN C2

>gi|74386|pir||WZVZB6 59K
HindIII-C protein - vaccinia
virus (strain WR) 


1.7 


461 


AE001395


Plasmodium 
falciparum 
chromosome 2, 
section 32 of 73 of 
the complete 
sequence 


0.55 


137532 


PROTEIN C2

>gi|74386|pir||WZVZB6 59K
HindIII-C protein - vaccinia
virus (strain WR) 


1.7 


462 


U59736 


Human transcription 
factor (NFATc.b) 
mRNA, complete cds 


0.55 


3327144 


(AB014565) KIAA0665 protein
[Homo sapiens] 


0.096 


463 


U34860 


Saccharomyces 
cerevisiae origin 
recognition complex 
large subunit (ORC1) 
gene, complete cds 


0.55 


140372 


HYPOTHETICAL 86.0 KD 
PROTEIN IN GLK1-SRO9
INTERGENIC REGION
>gi|83159|pir||S19367
hypothetical protein YCL039w - 
yeast (Saccharomyces 
cerevisiae) 


0.017 


464 


AF012341 


Homo sapiens 
glutaryl-CoA 
dehydrogenase 
(GCDH) gene, exons 
6, 7, 8, 9, and 10


0.55 


1166611 


(U46674) coded for by C. 
elegans cDNA yk27d9.5; coded
for by C. elegans cDNA
yk27d9.3; short region of weak 
homology to drosophilia 
suppressor of sable protein 


O.OOS 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


HIV-1 isolate










465 


AF004891


CxA from Kenya, 
envelope 

glycoprotein C2V3 
region (env) gene, 
partial cds 


0.54


<NONE>


<NONE>


<NONE>


466


Y10159


D.discoideum 
racGAP gene 


0.54 


<NONE> 


<NONE> 


<NONE> 


467 


AB001895


Homo sapiens mRNA 
for B120, complete
cds


0.54


<NONE>


<NONE>


<NONE>


468 


X12357 


Bovine gene for
aspartyl protease
NM1 exons 3 and 4 >
:: ld|X12357 Bovine 
aspartyl protease 
NM1 gene, exons 3 
and 4. 


0.54 


<NONE> 


<NONE> 


<NONE> 


469 


AE001151 


Borrelia burgdorferi 
(section 37 of 70) of 
the complete genome 


0.54 


<NONE> 


<NONE> 


<NONE> 


470 


X92052 


H.sapiens mRNA for 
T cell receptor alpha 
chain 


0.54


<NONE>


<NONE>


<NONE>


471 


U00938 


Mus musculus ileal 
lipid-binding protein 
gene, complete cds 


0.54 


1009712 


(U27698) calreticulin 
[Arabidopsis thaliana]


4.9 


472 


X68367 


M.thermoformicicum 
complete plasmid
pFZ1 DNA


0.54 


125272 


CASEIN KINASE II ALPHA
CHAIN (CK II)
>gi|419938|pir||A43297 casein
kinase II (EC 2.7.1.-) alpha
chain - Theileria parva
>gi|161871 (M92084) casein
kinase II alpha subunit
[Theileria parva]


4.7 


473 


Z61098 


H.sapiens CpG DNA, 
clone 44c4, reverse 
read cpg44c4.rt1a.


0.54 


4191274 


(AJ131094) Xvent-1B protein
[Xenopus laevis]


3.7 


474 


M63962 


Human gastric H,K-
ATPase catalytic 
subunit gene, 
complete cds. 


0.54 


388164S 


(Z70757) similar to serine 
protease inhibitor 
[Caenorhabditis elegans]


3.7 


475 


XS6019 


H.sapiens mRNA for 
PRPL-2 protein 


0.54 


164882S 


(D87963) ETF-related factor-1
(ETFR-1) 


2.1
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






S.glaucescens genes 










476 


X89010 


strU, strX, strV and
strW for 5'-
hydroxystreptomycin
production and
transport 
polypeptides 


0.54 


3550345 


(AF084524) cellular repressor 
of E1A-stimulated genes CREG
[Mus musculus] 


0.25 


477 


AB007836 


Homo sapiens mRNA 
for Hic-5, partial cds


0.54 


1097213 


ORF 1 [Streptomyces 
lavendulae] 


0.15 


478 


U32622 


Comamonas 
testosteroni TsaR 
(tsaR), 

toluenesulfonate 
methyl- 

monooxygenase 
oxygenase component 
component (tsaB), 
toluenesulfonate zinc-
independent alcohol
dehydrogenase... 


0.54 


3875351 


(Z96047) DY3.6
[Caenorhabditis elegans] 


0.006 


479 


D61394 


Arabidopsis thaliana 
gene for beta-VPE, 
complete cds 


0.53 


<NONE> 


<NONE> 


<NONE> 


480 


D61394 


Arabidopsis thaliana 
gene for beta-VPE, 
complete cds 


0.53 


<NONE> 


<NONE> 


<NONE> 


481 


Z33072


M.capricolum DNA 
for CONTIG MC097 


0.53 


<NONE> 


<NONE> 


<NONE> 


482


U45975 


Human 

phosphatidylinositol 
(4,5)bisphosphate 5- 
phosphatase homolog 
mRNA, partial cds. 


0.53 


<NONE> 


<NONE> 


<NONE> 


483 


< 
i 

Z71324


S.cerevisiae
chromosome XIV
reading frame ORF
YNL048w


0.53 


1 

2135586 


VI 130 antigen (cytosolic variant 
i) - human 


2.1 


484 


I 
r 

L32090 


Listeria

monocytogenes secA
gene, complete cds.


0.53 


( 

2291129


(AF016415) No definition line
found [Caenorhabditis elegans]


0.70
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


485


D86423


Mus musculus mRNA
for HGT keratin,
partial cds


0.53 


1235974 


(X96713) collagen [Globodera 
pallida]


0.41 


486 


Y15969


Mus musculus V 
kappa 2 1-6 gene, 
partial 


0.52


<NONE> 


<NONE> 


<NONE> 


487 


M27480 


Mus musculus (clone 
3F9) transcribed 
germline T cell 
receptor gamma chain
(Tcr-g) mRNA, VJ4
C4 region.


0.52


3875542 


(Z67990) Similarity to Rat
amiloride-sensitive sodium
channel beta-subunit 


4.6 


488 


D87004 


Human (lambda) 
DNA for 

immunogloblin light 
chain 


0.52


1766073 


(U37272) winged helix protein 
CWH-1 [Gallus gallus] 


3.5 


489 


Z99704 


Human DNA 
sequence from 
cosmid E75B8 on 
chromosome 22, 
complete sequence 
[Homo sapiens] 


0.51


<NONE> 


<NONE> 


<NONE> 


490 


U76523 


Sambucus nigra lectin 
precursor mRNA, 
complete cds 


0.51 


<NONE> 


<NONE> 


<NONE> 


491 


U32795 


Haemophilus 
influenzae Rd section 
110 of 163 of the 
complete genome 


0.50


<NONE> 


<NONE> 


<NONE> 


492 


M14602


Human myoglobin
gene, exon 2.


0.49


478384 


helicase homolog g10L protein -
African swine fever virus
>gi|414091 (X72951) G10L 125
KDa protein 


7.0 


493 


I 
I 

D87075


Human mRNA for
KIAA0238 gene,
partial cds


0.24 


i 

r 

( 
I 

1938429


(U97002) similar to
Schizosaccharomyces pombe 4-
nitrophenylphosphatase
(PNPPASE) (SP:Q00472,
PID:g5004) [Caenorhabditis
elegans]


2.5 


494 


n 
P 

U95102


Xenopus laevis
mitotic

phosphoprotein 90
mRNA, complete cds


0.23


<NONE> 


<NONE>


<NONE> 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






N.crassa 










495 


J05254 


mitochondrial small 
(19S) rRNA and Cys-
tRNA. 


0.23 


192150 


(L05670) clustrin [Mus 
musculus] 


5.1 


496 


X16399


Gene for glutamate 
dehydrogenase (EC 
1.4.1.4), put. bacterial 
origin 


0.23 


790933 


(L07867) invariant surface 
glycoprotein [Trypanosoma 
brucei] 


0.030 


497 


AE001251


Treponema pallidum 
section 67 of 87 of 
the complete genome 


0.22 


<NONE> 


<NONE> 


<NONE> 


498 


AF026919 


Homo sapiens 
amyloid lambda light 
chain variable region 
mRNA, partial cds 


0.21 


<NONE> 


<NONE> 


<NONE> 


499 


Z27247 


D.melanogaster 
mRNA for defensin 


0.21 


<NONE> 


<NONE> 


<NONE> 


500 


Y15608


Candida albicans
UBI3 gene


0.21 


<NONE> 


<NONE> 


<NONE> 


501 


V00598 


Human beta-tubulin 
pseudogene. 


0.21 


<NONE> 


<NONE> 


<NONE> 


502 


X79426 


A.thaliana 
microsatellite 
[repeated motif 
(sat)71 


0.21 


<NONE> 


<NONE> 


<NONE> 


503 


X75772 


A.caerulescens 
mitochondrial genes 
for cytochrome b and 
NADH 

dehydrogenase 5 


0.21 


139626 


PROTEIN Tl PRECURSOR 


7.S 


504 


AF028736 


Serratia marcescens 
site specific 
recombinase 


0.21 


3645960 


(AL031583) 1- 

evidence=predicted by content; 
1-method=genefinder;084; 1-
method_score=47.46; 1- 
evidence_end; 2- 
evidence=predicted by match; 2- 
match_accession=SWISS- 
PROT:P23792; 2- 
match_description=DISCONNE 
CTED PROTEIN.; 2-matc... 


4.6 


505 


X97545 


S.cerevisiae OST5 
gene 


0.21 


2275631 


(AF014940) No definition line 
found [Caenorhabditis elegans]


2.7


Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















506 


M24543 


Human prostate- 
specific antigen (PA) 
gene, complete cds. 


0.21 


1938527 


(U97012) C04E6.2 gene 
product [Caenorhabditis 
elegans]


2.7 


507 


M62470 


Mouse 

thrombospondin 
(THBS1) gene,
complete cds. 


0.21 


548563 


RNA REPLICASE

POLYPROTEIN 2.7.7.48) -
Erysimum latent virus 
>gi|3892232 (AF098523) 
replicase protein [Erysimum 
latent virus] 


2.1 


508 


Y13544


Homo sapiens cosmid
CI 


0.21 


1235710 


(L40584) polyprotein 
[Infectious pancreatic necrosis 
virus] 


2.0 


509 


M24193 


Chicken MHC B 
complex protein (C12 
3) mRNA, complete 
cds. 


0.21 


3600102 


(AF090441) extracellular reelin 
[Gallus gallus] 


0.52 


510 


X97161 


H.sapiens TFE3 gene, 
exon 4,5 & 6 


0.21 


854065 


(X83413) U88 [Human 
herpesvirus 6] 


0.30 


511 


X67649 


R.norvegicus DNA 
sequence for 
LFB1/HNF1 
promoter 


0.21 


3913114 


TRANSCRIPTION FACTOR 
COUP 2 COUP-TFII - chicken 
>gi|392817 (U00697) orphan 
receptor COUP-TFII [Gallus 
gallus] 


0.004 


512 


U63807 


Fugu rubripes growth 
hormone (GH) gene, 
complete cds 


0.21 


3510505 


(AF030881) pol polyprotein 
Fugu rubripes] 


3e-04


513


Z95636


H.sapiens mRNA for 
laminin alpha 5 chain 


0.21 


400350 


NAM7 PROTEIN (NONSENSE-
MEDIATED MRNA DECAY
PROTEIN 1) (UP-
FRAMESHIFT SUPPRESSOR
1) factor NAM7 - yeast
(Saccharomyces cerevisiae) 
>gi|4023 


1e-07


514 


U91907 


Mirounga leonina

histocompatibility 
complex class II 
(DQA) gene, partial 
cds 


0.20 


<NONE> 


<NONE> 


<NONE> 


515 


Z35758 


Transmissible 
gastroenteritis virus 
TFI virion protein 
genes 


0.20 


<NONE> 


<NONE> 


<NONE> 


516 


X00334 


Drosophila virilis 
simple DNA 
sequence (pDv-19) 


0.20 


<NONE> 


<NONE> 


<NONE> 



WO 01/02568
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















517 


M76741 


Homo sapiens biliary 
glycoprotein (BGP) 
gene, partial cds. 


0.20 


<NONE> 


<NONE> 


<NONE> 


518 


D78515 


Mus musculus rae28 
gene, exon 1 and 
5'flanking region 


0.20 


<NONE> 


<NONE> 


<NONE> 


519 


M62975 


Drosophila 
melanogaster RNA 
polymerase II second 
largest subunit 
upstream (DmRP 
140) gene, exons 1-4. 


0.20 


<NONE> 


<NONE> 




520 


M27260 


Chicken 78-kD 
glucose-regulated 
protein, complete cds. 


0.20


<NONE>


<NONE>




521 


AF076470 


Rice tungro 
bacilliform virus 
Serdang strain, 
complete genome 


0.20 


<NONE> 


<NONE> 


<NONE> 


522 


AF076470 


Rice tungro 
bacilliform virus 
Serdang strain, 
complete genome 


0.20 


<NONE> 


<NONE> 


<NONE> 


523 


U04636 


Human 

cyclooxygenase-2 
(hCox-2) gene, 
complete cds. 


0.20 


<NONE> 


<NONE> 


<NONE> 


524 


AE001430 


Plasmodium 
falciparum 
chromosome 2, 
section 67 of 73 of 
the complete 
sequence 


0.20 


<NONE> 


<NONE> 


<NONE> 


525 


1 

I 

( 

AF043514


Mus musculus
phosphomannomutase
(Pmm2) mRNA,
complete cds


0.20 


3025006


HYPOTHETICAL iS.5 Kb 
PROTEIN IN MOAE-RHLE 
INTERGENIC REGION
>gi|1787009 (AE000181) orf,
hypothetical protein
[Escherichia coli]


9.S 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















526 


U23144 


Xenopus laevis FTZ- 
F1-related nuclear
orphan receptor
variant (xFF1rAshort)
mRNA, complete cds


0.20 


3184402 


(AB014477) period protein 
[Chymomyza costata] 


9.6 


527 


U14621 


Paracentrotus lividus 
Pax-6 (suPax-6) 
mRNA, complete cds. 


0.20 


465894 


PROBABLE MICROSOMAL
SIGNAL PEPTIDASE 23 KD
SUBUNIT (SPC22/23)
>gi|630688|pir||S44854 
K12H4.4 protein - 
Caenorhabditis elegans 
>gi|289708 (L14331) homology 
with signal peptidase; coded for 
by C. elegans cDNAs GenBank: 
M79661, M79662 and M79663; 
putative 


7.7 


528 


AF030511


Actinobacillus
pleuropneumoniae 
MRP ATPase 
homolog (mrp) gene, 
partial cds; ApxIVA 
var3 (apxIVA) gene, 
complete cds; and 
beta-galactosidase 
(lacZ) gene, partial 
cds 


0.20 


1175966 


HYPOTHETICAL 45.3 KD
PROTEIN IN THI5 5' REGION
>gi|1084720|pir||S56193
probable membrane protein 
YFL062w - yeast 
(Saccharomyces cerevisiae) 


7.2 


529 


AF070581 


Homo sapiens clone 
24540 mRNA 
sequence 


0.20 


542394 


glyoxal oxidase (EC 1.2.3.-) 
precursor - basidiomycete 
(Phanerochaete chrysosporium)
>gi|1050302


5.8 


530 


X75437 


T.maritima pgK gene
for 3- 

phosphoglycerate 
kinase 


0.20 


825648 


(Z34531) coproporphyrinogen
oxidase [Homo sapiens] 


5.8 


531 


U32686 


Haemophilus
influenzae Rd section 
1 of 163 of the 
complete genome 


0.20 


3309593 


(AF07287S) ciliary outer arm 
dynein beta heavy chain 


5.6


532 


Z28081


S.cerevisiae
chromosome XI
reading frame ORF 
YKL081W 


0.20 


< 

2507201 


CARBON CATABOLITE 
DEREPRESSING PROTEIN
KINASE >gi|1469803 (L78129)
serine/threonine kinase [Candida
albicans]


5.5


Nearest Neighbor (BlastN vs. Genbank)



SEQ 

ID | ACCESSION



DESCRIPTION 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



P VALUE | ACCESSION 



Hordeum vulgare



DESCRIPTION 



P VALUE



limit dextrinase 
(HvLD99) gene, 
533 | AF022725 complete cds



0.20 



3139154 



(AF064077) adrenocorticotropic
hormone receptor [Sus scrofa]



Drosophila 
melanogaster cosmid 
534 | AL021726 171E4



0.20 



3885334 



(AC005623) putative argonaute 
protein [Arabidopsis thaliana]



Brassica rapa mRNA 
for SRK45, complete 
535 | AB012106 cds



(Z92824) B0413.4
0.20 | 4008334 [Caenorhabditis elegans] 



H.sapiens HLTF gene 
for helicase-like
536 | Z46606 transcription factor



0.20 



132946 



60S RIBOSOMAL PROTEIN
L30B (RP29) cytosolic - yeast
(Saccharomyces cerevisiae)
>gi|171821 not determined)
[Saccharomyces cerevisiae]
>gi|1045254 cerevisiae]
>gi|1323250|gnl|PID|e243708
(Z72933) ORF YGR148c
[Saccharomyces cerevisiae]



1.5 
1.5 



H.sapiens mRNA for 
537 | X87193 2.19 gene



0.20 



139820 



DNA-REPAIR PROTEIN
XRCC1 



538 



Clostridium 
perfringens C beta 2 
toxin gene, complete 
L77965 cds



0.20 



1175950



|H¥PU1HUUJAL JIdKD 
PROTEIN IN SEC53-ACT1
INTERGENIC REGION
>gi|1084703|pir||S56211
probable membrane protein
YFL044c - yeast
(Saccharomyces cerevisiae)
>gi|836711|gnl|PID|d1009835
(D50617) YFL044C



Chicken neural cell 
adhesion molecule (N-
539 | M15938 CAM) gene, exon 18



0.20 



2133082 



regulatory protein MSR1 - yeast



Solanum tuberosum
mRNA for extensin-
540 | AJ003220 like protein, partial



0.20



2496932
iHYWTftfflCAL Kd 
PROTEIN C56G2.1 IN
CHROMOSOME III 
>gi|726413 (U23177) C56G2.1 
gene product [Caenorhabditis 
elegans] 



541 | X98108 A.thaliana psbP gene



0.20 



119227 



EPIDERMAL GROWTH
FACTOR PRECURSOR
precursor - mouse >gi|309210
(J00380) prepro-egf [Mus

musculus]



1.4 



1.1



1.1
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















542 


AB011179 


Homo sapiens mRNA 
for KIAA0607 
protein, partial cds 


i 

0.20 


2143753 


gene VGP protein - rat 
>gi|205690 (M60525) nerve 
growth factor inducible protein 
[Rattus norvegicus] >gi|205701 
(M60522) nerve growth factor- 
inducible protein [Rattus 
norvegicus] >gi|207651 


0.39 


543 


X75318 


H.sapiens ITIH1 gene
(exon 22) and ITIH3 
gene 


0.20


629557 


RNA-binding protein mpD -
Arabidopsis thaliana (fragment)
>gi|510240 (X61108) RNA
binding protein [Arabidopsis 
thaliana] 


0.38 


544 


AB008374 


Oncorhynchus mykiss 
mRNA for alpha 3 
type I collagen, 
partial cds 


0.20 


1082610 


muf1 protein - human
>gi|762953 (X86018) muf1
[Homo sapiens] 


0.37 


545 


U09809 


Limulus polyphemus 
arginine kinase 
mRNA, complete cds. 


0.20 


3882016 


(AJ012650) CP [Papaya
ringspot virus] 


0.37


546


AB020671


Homo sapiens mRNA 
for KIAA0864 
protein, partial cds 


0.20 


2674350 


(U93121) M-phase
phosphoprotein-1 [Homo
sapiens] 


0.1S 


547 


L04457 


Phytophthora
megasperma
mitochondrial
ORF152, complete
cds, cytochrome c
oxidase subunit I
(cox1) gene,
complete cds,
cytochrome c oxidase
subunit II 


0.20 


746516 


(U23517) D1022.7
[Caenorhabditis elegans]
>gi|3258651 elegans] 


0.043 


548


L04457


Phytophthora
megasperma
mitochondrial
ORF152, complete
cds, cytochrome c
oxidase subunit I
(cox1) gene,
complete cds,
cytochrome c oxidase
subunit II


0.20 


746516


(U23517) D1022.7
[Caenorhabditis elegans]
>gi|3258651 elegans]


0.042
Nearest Neighbor (BlastN vs. Genbank)



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



SEQ 

ID | ACCESSION



DESCRIPTION 



Ldior=cyclin- 



P VALUE | ACCESSION 



DESCRIPTION 



P VALUE 



549 | S82819



dependent kinase : 
regulatory subunit 
p35 [mice, brain, 
129/SvJ, C57BL/6,
Genomic/mRNA,
5528 nt]



(AB007923) KIAA0454 protein 
0.20 | 3413870 [Homo sapiens] 



0.020 



550 | D31792 



Streptomyces griseus 
DNA for 
serine/threonine 
protein kinases, 
complete cds 



0.20 



861405 



(U29154) T07F12.2 gene 
product [Caenorhabditis 
elegans] 



551 | U97499 



Homo sapiens
butyrophilin (BT3.2) 
gene, exons 5-10, and 
complete cds 



0.20 



2773341 



(AF040954) putative protein 
phosphatase 1 nuclear targeting 
subunit [Rattus norvegicus]



0.008 



552 | U31463



Rattus norvegicus 
nonmuscle myosin 
heavy chain-A 
mRNA, complete cds



(Z81130) predicted using
0.20 | 3880111 Genefinder



553 | X78401



Bacteriophage P22 
right operon, orf 48, 
replication genes 18 
and 12. nin region 
genes, ninG 
phosphatase, late 
control gene 23, orf 
60, complete cds, late 
control region, start 
of lysis gene 13 



0.20 



1123087 



(U42436) C49H3.3 gene 
product [Caenorhabditis
elegans]



554 | X57310



Nocardia 

lactamdurans pcbAB 
and pcbC genes for 
alpha-aminoadipyl-L-
cysteinyl-D-valine
synthetase and 
isopenicillin N 
synthase 



0.20 



1723511 



PUTATIVE ENDONUCLEASE 
C1F12.06C yeast 
(Schizosaccharomyces pombe)
>gi|12179S0 (Z69944) unknown
[Schizosaccharomyces pombe]



4e-09 



555



X62386 



S.epidermidis genes 
cpiY\ epiY, epiA, 
epiB, epiC, epiD,
epiQ, epiP



0.20 



3874927 



(Z73424) C44B9.1 
[Caenorhabditis elegans] 



3e-10 



Nearest Neighbor (BlastN vs. Genbank)



ID | ACCESSION DESCRIPTION



Epizootic 

haemorrhagic disease
virus gene segment 6
556 | X59000 for NS1



557 | M98776 



Human keratin 1 
gene, complete cds



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



P VALUE | ACCESSION 



0.20 



3879755 



0.20 



1086900 



DESCRIPTION 



U8UJ2UJ similar to nucleotide 



binding protein; cDNA EST
EMBL:M75897 comes from this 
gene; cDNA EST 
EMBL:M89054 comes from this 
gene; cDNA EST 
EMBL:D26713 comes from this 
gene; cDNA EST 
EMBL:D26718 comes from this
gene; cDNA 



P VALUE



(U41278) contains similarity to 
G beta repeats 



8e-16 



Mus musculus 
granzyme K gene,
558 | AF011446 complete cds



559 | AF074708 



Macaca mulatta clone 
MMU1.5 FRGMikc 
pseudogene, exons 7
and 8, partial
sequence



0.19 



<NONE> 



0.19 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 
<NONE> 



560 | X13287



Medicago sativa
nodulin-25 gene



0.19 



<NONE> 



<NONE> 



561 | Z49509



S.cerevisiae
chromosome X 
reading frame ORF 
YJR009c 



562 | D89041



Bovine DNA for
prostaglandin
F2alpha receptor, 
partial cds 



0.19 



<NONE> 



0.19 



563 | D29644 



Streptococcus 
salivarius DNA for
dextranase



<NONE> 



0.19 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



564 | AE001461



Helicobacter pylori, 
strain J99 section 22 
of 132 of the 
complete genome 



0.19 



<NONE> 



<NONE> 



565 | L38559 



Homo sapiens 
galactocerebrosidase 
(GALC) gene, exon 
17. 



566 | Z8262S 



R.prowazekii 
genomic DNA 
fragment (clone 
A405F) 



0.19 



<NONE> 



0.19 



<NONE> 



<NONE> 



<NONE> 



Nearest Neighbor (BlastN vs. Genbank)



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



SEQ 

ID | ACCESSION



DESCRIPTION 



P VALUE | ACCESSION 



DESCRIPTION 



P VALUE 



567 | U25641 



Tetrahymena
thermophila
telomerase
component p80 
mRNA, complete cds 



0.19 



<NONE> 



<NONE> 



<NONE> 



568 | AB002343



Human mRNA for 
KIAA0345 gene, 
complete cds 



569 | D10064



Erwinia carotovora 
gene for pectate lyase 
III, complete cds 



0.19 



<NONE> 



<NONE> 



<NONE> 



0.19



<NONE> 



<NONE> 



<NONE> 



570 | U31734



Homo sapiens clone
MF118 A4AI0
hypoxanthine 
phosphoribosyltransfe 
rase (hprt) 130 kb 
deletion mutant 
mRNA, partial cds, 
contains human Alu 
element 



0.19 



<NONE> 



<NONE> 



<NONE> 



571 | AE001386 



Plasmodium 
falciparum 
chromosome 2, 
section 23 of 73 of 
the complete 
sequence



0.19 



<NONE> 



<NONE> 



<NONE> 



572 | M95623



Homo sapiens 
hydroxymethylbilane 
synthase gene, 
complete cds. 



0.19 



<NONE> 



<NONE> 



<NONE> 



573 | S67478 



574 | X99075



(GC*IS)=vitamin D- 
binding protein/group 
specific component 
human, peripheral 
blood leukocytes, 
Genomic, 794 nt, 
segment 4 of 9] 



0.19 



<NONE> 



H.sapiens NRGN 
gene, exon 1



<NONE> 



<NONE> 



0.19 



<NONE> 



<NONE> 



<NONE> 



575 | AF044775



Homo sapiens 
breakpoint cluster 
region BCRder 14 
sequence 



0.19 



<NONE> 



<NONE> 



<NONE> 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION


DESCRIPTION


P VALUE






Human mRNA for 










576 


AB002333 


KIAA0335 gene, 
complete cds 


0.19 


<NONE> 


<NONE> 


<NONE> 


577 


U53566 


1/GHF-l 

transcription factor 
mRNA, complete cds


0.19 


1078068 


probable membrane protein 
YLR311c - yeast


9.2 


578 


U73664 


Human 

t(11;14)(q13;q32)
breakpoint junction 
sequence 


0.19 


116734 


COAT PROTEIN (CAPSID
PROTEIN) virus >gi|58901 
(X62133) CyMV coat protein 
gene product 


8.8 


579 


f\Xr\J\J t *\JJ** 


Heterophyllaea

pustulata rps16 gene,
chloroplast gene,
partial intron
sequence


0.19


1928991 


(U92815) heat shock protein 70 
precursor [Citrullus lanatus]


8.7 


580 


Z27081 


Caenorhabditis 
elegans cosmid
M01A8, complete
sequence
[Caenorhabditis
elegans]


0.19 


2496247 


HYPOTHETICAL ATP-
BINDING PROTEIN MJ0625
>gi|2128413|pir||A64378
hypothetical protein MJ0625 -
Methanococcus jannaschii
>gi|1591336 (U67510) M.
jannaschii predicted coding
region MJ0625


8.6 


581 


Z74145 


S.cerevisiae 
chromosome IV 
reading frame ORF 
YDL097c 


0.19 


1174425 


TYROSINE-PROTEIN
KINASE SPK-1 


6.7 


582 


D38547 


Small round 
structured virus 
genomic RNA, 
3' terminal sequence
containing ORF2 and
ORF3


0.19 


971318 


(Z48053) putative protein
[Bovine herpesvirus 1]


5.1 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Ralstonia eutropha










583 


D88000 


DNA for 16S ribosomal

RNA > ::

dbj|D88002|D88002
Ralstonia eutropha
DNA for 16S
ribosomal RNA > :: 
dbj|D88003|D88003 
Ralstonia eutropha 
DNA for 16S 
ribosomal RNA > :: 
dbj|D88004|D88004 
Ralstonia eutropha 
DNA for 16S 
ribosomal RNA 


0.19 


3800952 


(AF100657) No definition line 
found [Caenorhabditis elegans]


5.1 


584 


U67462 


Methanococcus 
jannaschii section 4 
of 150 of the 
complete genome 


0.19


3183617 


(AJ005586) MYB-related 
transcription factor 
[Antirrhinum majus]


4.0 


585 


L23906 


Gallus domesticus 
microsatellite DNA 
marker. 


0.19 


1947094 


(U93074) voltage-gated sodium 
channel homolog BdNal 


3.9 


586 


AE001462 


Helicobacter pylori, 
strain J99 section 23 
of 132 of the 
complete genome 


0.19 


1730177 


GLUCOSE-6-PHOSPHATE 
ISOMERASE (GPI)
ISOMERASE) (PHI)
>gi|2118333|pir||I4S073 glucose
phosphate isomerase - Chinese
hamster >gi|987046 griseus]


3.9 


587 


i 

M19460


P.putida catBC
operon encoding
cis,cis-muconate
lactonizing enzyme I
and muconolactone
isomerase genes,
complete cds.


0.19 


] 

1 
< 

3873843 1 


iz.fl/ooj CLiNA hi>I 
yk251g7.3 comes from this 
gene;cDNA EST yk251g7.5 
comes from this gene; cDNA 
EST EMBL:D68223 comes 
from this gene; cDNA EST 
EMBL:C12737 comes from this
gene; cDNA EST yk389cS.5
comes from this gene; cDNA


3.9 


588 


t 

U22349


Tetrahymena australis
telomerase RNA
gene, complete
sequence


0.19 


C 

4105782


(AF049922) PGP169-12
[Petunia x hybrida]


3.2 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


589 | L27745


Homo sapiens voltage-
operated calcium
channel, alpha-1
subunit mRNA,
complete cds. | 0.19


3763926 


(AC004450) unknown protein 
[Arabidopsis thaliana] 


3.0


590 | AF049588


Canis familiaris
synapsin I gene,
partial cds | 0.19


4104931 


(AF042196) auxin response 
factor 8 [Arabidopsis thaliana]


3.0


591


X06627


Staphylococcus
aureus plasmid pS194
sequence


0.19


137927


PRE-NECK APPENDAGE
PROTEIN (LATE PROTEIN
GP12) >gi|75856|pir||WMBP22
gene 12 protein - phage phi-29
>gi|215330 (M14782) pre-neck
appendage protein
[Bacteriophage phi-29]
>gip25367|prf]|1301270G gene
12 [Bacteriophage phi-29]


2.3


592


X61597


M.musculus gene for
kallikrein-binding
protein


0.19


2982874


(AE000675) cobalamin
synthesis related protein CobW


1.7


593


AF016242


Dictyostelium
discoideum protein
synthesis elongation
factor 1-alpha (tef2)
gene, partial cds


0.19


133659 


PUTATIVE RNA-DIRECTED
RNA POLYMERASE


1.4


594


AF004447


Venezuelan equine
encephalitis virus
strain 1327
polyprotein gene,
partial cds > ::
gb|AF004460|AF004460
Venezuelan
equine encephalitis
virus strain 1385
polyprotein gene,
partial cds


0.19 


4096173


(U25968) early embryogenesis
protein [Oryza sativa]


1.3


595


J04821


Human elastin (ELN)
gene, exon 1, clones
HELC-5 and HELC-


0.19 


1170523


INHIBIN BETA B CHAIN
PRECURSOR inhibin precursor
bovine >gi|563753 (U16241)
betaB inhibin/activin precursor
[Bos taurus]


1.3


596


AF059650


Homo sapiens histone
deacetylase 3
(HDAC3) gene,
complete cds


0.19 


3024881


PROBABLE TRANSPORT
PROTEIN CY21C12.U
>gi|2078066|gnl|PID|e315171
95210) betP


0.83



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


597


M69053


D.melanogaster
calcium-activated K+
channel subunit


0.19


1707984


FERREDOXIN-DEPENDENT
GLUTAMATE SYNTHASE 1
(FD-GOGAT)
>gi|2126524|pir||S60228
glutamate synthase (ferredoxin)
(EC 1.4.7.1) gltB -
Synechocystis sp. (PCC 6803)
>gi|515938 (X80485) glutamate
synthase


0.80


598


AF076279


Dictyostelium
firmibasis plasmid
Dfp1, complete
plasmid sequence


0.19


453986


(U00008) yejA [Escherichia
coli]


0.79


599


D28373


Mouse MCNP gene
for C-type natriuretic
peptide, complete cds
(exon1, exon2)


0.19


2650444


(AE001092) acetyl-CoA
synthetase (acs-1)
[Archaeoglobus fulgidus]



600 



U06071 



Oxytricha nova 
macronuclear actin II
gene, complete cds, 



0.19 



1584024 



complement control protein 
[Botryllus schlosseri]



601 



L54057 



Homo sapiens CLP 
mRNA, partial cds. 



0.19 



3036883 



(AL022374) putative ABC 
transporter 



602 



603 



X89806 



P.lividus cDNA for
COLL2alpha gene 



0.19 



AE001104 



Archaeoglobus 
fulgidus section 3 of 
172 of the complete 
genome 



3638957 



(AC004877) sco-spondin-mucin-
like; similar to P98167 uncertain
[Homo sapiens]



0.41 



0.19 



2315192 



(Y11739) transcription factor
[Homo sapiens]



604 



605 



606 



607 



U54501 



X74468 



U20285 



D49408 



Rattus norvegicus 
microsatellite 
sequence D0Mco22 



0.19 



228951 



D-MeAsp
receptor:ISOTYPE=epsilon3
[Mus musculus]



0.32 



Human 

papillomavirus type 
15 genomic DNA 



0.19 



3695390 



Human Gps1 (GPS1)
mRNA, complete cds 



Human gene for 
interleukin 3 receptor 
alpha subunit, exon 
10 



0.19 



2582659 



0.19 



252236S 



(AF096371) contains similarity 
to Rattus norvegicus cyclin G- 
associated kinase (SW:P97874)
[Arabidopsis thaliana] 



0.28 



(AJ002527) glucitol-6- 
phosphate dehydrogenase 
[Clostridium beijerinckii]



0.27 



(AF00S596) alpha 1,3-
fucosyltransferase [Helicobacter
pylori]



0.16 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


608 


AF041141 


Homo sapiens
pituitary specific
homeodomain protein
(PROP1) gene, exon
3 and complete cds


0.19 


37403 


(X03541) trk gene product (aa 1-
641) [Homo sapiens]


0.091 


609 


L12531 


Discopyge ommata 
Ca2+ channel alpha 1 
subunit gene 
sequence. 


0.19 


3618274 


(AJ223219) hypothetical protein 


0.069 


610 


AF052445 


Yellow fever virus 
clone HONG9 
polyprotein gene,
complete cds 


0.19 


1932822 


(U15928) KH-domain putative
RNA binding protein 


0.001 


611 


Z36946 


B.anthracis sap gene
encoding S-layer
protein 


0.19 


173241 


(L06487) ZIP1 protein 
[Saccharomyces cerevisiae] 


2e-04 


612 


AF087984 


Homo sapiens full 
length insert cDNA 
clone YW29A12 


0.19 


3786014 


(AC005499) hypothetical 
protein [Arabidopsis thaliana] 


1e-06


613 


AE001010


Archaeoglobus 
fulgidus section 97 of
172 of the complete
genome


0.19 


3135493 


(AF060248) unknown
[Arabidopsis thaliana] 


7e-08 


614 


L08965 


Trichosporon 
cutaneum carbamoyl 
phosphate synthetase 
large subunit (argA) 
gene, partial cds


0.19


lUoOVUl 


(U41278) F33G12.3 gene 
product [Caenorhabditis 
elegans] 


2e-08 


615 


M91466


Rattus norvegicus 
A2b-adenosine 
receptor mRNA, 
complete cds.


0.19 


2984320 


(AE000773) acetoin utilization
protein [Aquifex aeolicus]




616


X95971


S.lividans groEL2
gene


0.19 


3925277


[alU3^64j) similar to
Uncharacterized protein family
UPF0034, Double-stranded
RNA binding motif; cDNA EST
yk489b3.5 comes from this
gene; cDNA EST yk439g7.5
comes from this gene
[Caenorhabditis elegans]


7e-10 


617 


U12539


Schizosaccharomyces
pombe scd2 (scd2)
gene, complete cds.


0.19 


1938549


(U97016) similar to drosophila
Rlcl gene product ribosomal
protein L4 (YML4)
(NID:g459259)


3e-14
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












618


U12539


Schizosaccharomyces
pombe scd2 (scd2)
gene, complete cds.


0.19


1938549


(U97016) similar to drosophila
Rlcl gene product ribosomal
protein L4 (YML4)
(NID:g459259)


9e-15


619 


Z68327 


Human DNA 
sequence from 
cosmid U25DU, 
between markers 
DXS366 and DXSS7 
on chromosome X. 


0.19 


3875774 


EMBL:D32434 comes from this 
gene; cDNA EST 
EMBL:D33710 comes from this
gene; cDNA EST
EMBL:D34467 comes from this
gene; cDNA EST
EMBL:D35005 comes from this
gene; cDNA EST
EMBL:D37535 comes from this
gene; ...

>gi|3878710|gnl|PID|e1348373
EST EMBL:D33710 comes 
from this gene; cDNA EST 
EMBL:D34467 comes from this 
gene; cDNA EST 
EMBL:D35005 comes from this 
gene; cDNA EST 
EMBL:D37535 comes from this 
gene; ... 


6e-15 


620 


U66525 


Dictyostelium 
discoideum 
ORFvegll4 mRNA, 
complete cds 


0.19 


3540281 


(AF056116) AIM related 
protein [Fugu rubripes] 


2e-17


621 


U25830 


Newcastle disease 
virus isolate Herts/33 
matrix protein 
mRNA, complete cds 


0.19 


2228750 


(U93868) RNA polymerase III 
subunit [Homo sapiens] 


1e-18


622 


U89407 


Mus musculus strain 
BALB/c delta-
aminolevulinic acid 
dehydratase (Lv) 
mRNA, partial cds 


0.19 


1825764 


(U88314) C46H1M1 gene 
product [Caenorhabditis 
elegans]


3e-25 


623 


AF095598


Bison bison
athabascae
microsatellite BBJ 2


0.18 


<NONE> 


<NONE> 


<NONE> 


624


AF064260


Strongylocentrotus
purpuratus SRC8
mRNA, complete cds


0.18 


<NONE> 


<NONE> 


<NONE> 



Nearest Neighbor (BlastN vs. Genbank)



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



SEQ 
ID 



ACCESSION! DESCRIPTION 



625 | U69533 



626 1 D89Q4I 



P VALUE 



ACCESSION 



DESCRIPTION 



Arabidopsis thaliana 
AtKAP alpha mRNA, 
complete cds



0.18



Bovine DNA for 
prostaglandin 
F2alpha receptor, 
partial cds 



0.18



<NONE>



<NONE>



P VALUE 



<NONE> 



<NONE>



<NONE> 



<NONE> 



627 | M24571 



Dictyostelium 
discoideum tRNA- 
Glu-GAA gene, clone
yGluGAA7.



0.18



628 1 X59772 



D.melanogaster ovo 
gene required for 
female germ line 
development



0.18



<NONE> 



<NONE> 



629 1 AL010209 



Plasmodium
falciparum DNA ***
SEQUENCING IN 
PROGRESS *** 
from contig 3-104, 
complete sequence 



0.18 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



630 | U67575 



Methanococcus
jannaschii section 117
of 150 of the
complete genome



0.18



111839 



inositol 1,4,5-triphosphate
receptor 2 - rat 



631 I U28730 



Caenorhabditis 
elegans cosmid 
K10B2 



0.18 



1787604 



(AE000232) orf, hypothetical 
protein [Escherichia coli]



8.3



632 I X99798 



L.lactis pepFl & 
pepF2 genes 



0.18 



3406624 



(AF079110) glycosomal malate
dehydrogenase [Trypanosoma 
brucei] 



8.1 



633 | AF025306 



Danio rerio band 4.1-
like protein 4 (nbl4)
mRNA, complete cds



0.18 



465445 



634 I AF059251 



Mus musculus 
lipoxygenase (alox) 
mRNA, complete cds



0.18 



1655667 



PROBABLE NUCLEAR
ANTIGEN herpesvirus 1 (strain
Kaplan) >gi|334072 (M34651)
ORF-3 protein [Pseudorabies 
virus] 



7.9 



(Z81368) hypothetical protein 
Rv2393 



6.6 



635 | Z22605



G.domesticus CTCF 
protein mRNA. 



0.18 



481864 



3-methyl-2-oxobutanoate
dehydrogenase 



6.6 



636 | AB011086



Homo sapiens mRNA
for KIAA0514
protein, complete cds



0.18 



3874158 



(Z81464) predicted using
Genefinder 



6.4 



Nearest Neighbor (BlastN vs. Genbank)



SEQ
ID



ACCESSION



637 



638 



639 



278536 



U67530 



M63781 



DESCRIPTION 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



P VALUE 



Caenorhabditis



ACCESSION 



elegans cosmid 
C07A4, complete 
sequence 
[Caenorhabditis 
elegans] 



Methanococcus 
jannaschii section 72 
of 150 of the 
complete genome 



Influenza 
A/Duck/England/1/62 
(H4N6) nucleoprotein 
mRNA, complete cds. 



DESCRIPTION 



0.18 



3702121 



0.18



3877946 



P VALUE 



(AJ011681) retinoblastoma-
related protein [Chenopodium
rubrum]



Weak similarity to 55 
KDA heat shock protein 
(TR:G602231);cDNA EST 
EMBL:D71705 comes from this 
gene; cDNA EST 
EMBL:D74382 comes from this 
gene [Caenorhabditis elegans] 



0.18 



3873663 



TZ6ybj4JcUNAbM 
EMBL:D71510 comes from this
gene; cDNA EST 
EMBL:C08449 comes from this 
gene; cDNA EST yk266bl2.3 
comes from this gene; cDNA 
EST yk266bl2.5 comes from 
this gene; cDNA EST 
yk461h7.3 comes from this 
gene; cDNA... 



6.4 



6.3 



640 



M73781 



Oryctolagus
cuniculus integrin
beta-8 subunit
mRNA, complete cds. 
> :: gb|I44828|I44828 
Sequence 3 from 
patent US 5635601 



0.18 



1362129 



major allergen OLE 1 7
common olive 



641 



X67219 



D.melanogaster Rop
gene



0.18 



3449286 



(AB011527) MEGF1 [Rattus
norvegicus] 



4.8 



642 



AF106941



Homo sapiens beta- 
arrestin 2 mRNA, 
complete cds 



0.18 



548353 



PROTEIN-PII
URIDYLYLTRANSFERASE
vinelandii >gi|39257 (X59610) 
uridylyl transferase 



643 



AF052602 



Danio rerio 
huntingtin (HD) 
mRNA, complete cds 



0.18 



241058 



potential IGF binding protein 
[chickens, Peptide Partial, 77 aa, 
segment 2 of 3] 



3.6 






SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION 



DESCRIPTION 



644 | AB020709



645 | AF096883



646 | L39928



647 | M17082



648 | X75318



649 | AF011908



650 | U04004

651 | U88155



Homo sapiens mRNA 
for KIAA0902 
protein, complete cds 



HIV- 1 isolate patient 

country USA pol 
polyprotein (pol) 
gene, partial cds 



Pyrocoelia miyako 
(clone pB-PmL41) 
luciferase mRNA,
complete cds 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



0.18 



DESCRIPTION 
»Ji4) predicted using 



3875570 



Genefinder; cDNA EST
EMBL:M75775 comes from this
gene; cDNA EST
EMBL:M89255 comes from this
gene; cDNA EST
EMBL:M89127 comes from this
gene; cDNA EST
EMBL:T00141 comes from this
gene; cDNA EST EMBL:T.



0.18 



3250696 (AL024486) putative protein 



Human 

carcinoembryonic 
nonspecific 
crossreacting antigen 
(CEA; NCA) gene, 
exons 1 and 2. 



H.sapiens ITIH1 gene
(exon 22) and ITIH3 
gene 



Mus musculus 
apoptosis associated 
tyrosine kinase 
(AATYK) mRNA, 
complete cds 



Simian 

immunodeficiency 
virus SIVagmVER-2 

envelope protein
gene, partial cds
Xenopus laevis 
RanGTPase 
activating protein 



0.18 



2914702



(AC003974) unknown protein
[Arabidopsis thaliana]



2.1 



0.18 



0.18 



1351833 



629557 



REGULATORY PROTEIN 

ABAA 

RNA-binding protein rnpD
Arabidopsis thaliana (fragment)
>gi|510240 (X61108) RNA
binding protein [Arabidopsis
thaliana]



0.18 



330442 



(K03332) nuclear antigen 2
[Epstein-Barr virus] 



0.18 



0.18 



135102 



995714 



ASPARTYL-TRNA
SYNTHETASE aspartate--
tRNA ligase (EC 6.1.1.12) -
Escherichia coli coli]
>gi|1736513|gnl|PID|d1016401
(D90S29) Aspartate-tRNA
ligase (EC 6.1.1.12)
[Escherichia coli]

(X91258) pid:e198503
[Saccharomyces cerevisiae]



0.72 



0.41 



5e-04



6e-11



2e-13



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


652 


Z18921


B.oleracea gene for S 
receptor kinase-like
protein 


0.18


3875535 


1 1 ) similar to nookinase; 
cDNA EST EMBL:D69553
comes from this gene; cDNA
EST EMBL:D65938 comes
from this gene; cDNA EST
yk280h9.3 comes from this
gene; cDNA EST yk280h9.5 
comes from this gene; cDNA 
EST yk223d!1.3 come... 


1e-19


653 


M60650


o .cere visiae o i Ax 
gene, complete cds. 


0.16


<NONE> 


<NONE> 


<NONE> 


654 


U80912 


Eucalyptus globulus 

NADP-isocitrate 

dehydrogenase 

(T^o\CT\\X\ mDMA 

^cgiUJJrtj mKINA, 
complete cds 


0.16


3766172 


(AF057298) ornithine 
decarboxylase antizyme 2 [Mus 
musculus] 


4.2 


655 


AF012899 


Sambucus nigra 

ribosome inactivating

protein precursor 
mRNA, complete cds 


0.16


76749 


hypothetical protein 4 - fowl 
adenovirus 1 


4.0 


656 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath-
B) mRNA, complete
cds


0.16


3044086 


(AF055904) unknown 
[Myxococcus xanthus]


0.60 


657 


AF030231


Glycine max sucrose 
synthase (SS) mRNA, 
complete cds 


0.078


<NONE> 


<NONE> 


<NONE> 


658 


M19183


Woodchuck hepatitis
virus (WHV),
complete genome,
clone WHV 59.


0.072


1076190


cell wall glycoprotein 75K
precursor - diatom
(Cylindrotheca fusiformis)
>gi|515363 (X80394) P75K
gene product [Cylindrotheca
fusiformis]


6.3 


659


U31557


Ovine adenovirus
IVa2 protein gene,
DNA polymerase
gene, terminal protein
gene and 52.55 kDa
protein gene, partial
cds


0.072


3511143


(AF061244) unknown
[Agrocybe aegerita]


6.2
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






660


AL021491


Caenorhabditis
elegans cosmid
Y44A6B, complete
sequence
[Caenorhabditis
elegans]


0.070 


<NONE> 


<NONE> 


<NONE> 


661 


M33874 


X.laevis Xotch
protein mRNA, 
complete cds. 


0.070 


1654096 


(Y09076) RAD3 
[Schizosaccharomyces pombe] 


0.23 


662 


AB012725 


Mus musculus
ZAN75 mRNA for 
zinc finger protein, 
complete cds 


0.069 


1350800 


MITOCHONDRIAL 
RIBOSOMAL PROTEIN S5 


2.0 


663 


AL021491 


Caenorhabditis 
elegans cosmid 
Y44A6B, complete 
sequence 
[Caenorhabditis 
elegans] 


0.068 


<NONE> 


<NONE> 


<NONE> 


664 


Z60318 


H. sapiens CpG DNA, 
clone lei, reverse 
read cpglel.rla . 


0.068 


1280134 


(U55376) F16H11.2 gene
product [Caenorhabditis
elegans]


2.6 


665 


Z35973 


S.cerevisiae 
chromosome II 
reading frame ORF 
YBR104w 


0.068 


2493000 


PROBABLE SUCCINYL-
COA:3-KETOACID-
COENZYME A
TRANSFERASE PRECURSOR
EMBL:Z14S16 comes from this
gene; cDNA EST
EMBL:Z14946 comes from this
gene; cDNA EST 
EMBL:D69746 comes from this 
gene; cDNA EST yk2 19663 
comes from this gene; cDNA 
ES... 


0.6S 


666 


Z86111 


Streptomyces lividans 
rpsP, trmD, rpIS, 
sipW, sipX, sipY, 
sipZ, mutT genes and 
4 open reading 
frames 


0.068 


1235974 


(X96713) collagen [Globodera 
pallida] 


4e-04 


667 


M729S0 


Anthonomus grandis 
vitellogenin gene 
(VTG), complete cds. 


0.068


3242750 


(AC005164) match to ESTs 
AA731I49 (NID:g2140138), 
AA73190S (NID:g2752719), 
AA287837 (NID:gl9335l9), 
AA262811 (NID:glS9S3S2), 
and AA825S20 (NID:g2S99l32) 


1e-59



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















668 


M34161 


Rat tachykinin (PPT) 
gene, exons 5 and 6. 


0.067 


<NONE> 


<NONE> 


<NONE> 


669 


L03811


Aspergillus niger zinc 
finger protein (creA)
gene, complete cds. 


0.067 


<NONE> 


<NONE> 


<NONE> 


670 


M64983 


Human fibrinogen 
beta chain gene, 
complete mRNA. > 
gb|I47706|I47706 
Sequence 3 from 
patent US 5639940 


0.067 


<NONE> 


<NONE> 


<NONE> 


671 


AF014051


Nicotiana tabacum 
Mg chelatase subunit 
(ChlH) mRNA, 
partial cds 


0.067 


<NONE> 


<NONE> 


<NONE> 


672 


Y07540 


H.sapiens sil gene 


0.067 


92331 


glycoprotein GP330, renal - rat 
(fragments)


7.5 


673 


AJ000347 


Rattus norvegicus 
mRNA for 3'(2'),5'-
bisphosphate 
nucleotidase 


0.067 


129238 


kD OOKINETE SURFACE 
ANTIGEN PRECURSOR 
(PRS25) >gi|320962|pir|| A44966 
25k ookinete surface antigen 
precursor - Plasmodium 
reichenowi reichenowi] 


7.4 


674 


L19979


Squid sodium channel 
mRNA. complete cds. 


0.067 


2128473 


hypothetical protein MJ0750 - 
Methanococcus jannaschii 
>gi| 1592304 (U67521) 
ferredoxin-type protein


1.5 


675 


X08050 


Yeast tRNA-Glu(3)
gene and flanking 
regions 


0.067 


1334398 


(X15081) MURF2 protein (AA 
1-348) 


0.65 


676 


X17115 


Human mRNA for
IgM heavy chain
complete sequence


0.067 


1731331 


HYPOTHETICAL 51.6 KD

DD/"YT'Cr\T /""V/IO 1 AO

>gi|1370241|gnl|PID|e247089
(Z73966) hypothetical protein
Rv2075c [Mycobacterium
tuberculosis]


0.51 


677


AF032871


Homo sapiens
uncoupling protein 3
(UCP3) gene, exon 1
and partial exon 2


0.067 


112900


ALPHA-2C-1 ADRENERGIC
RECEPTOR human >gi|178194
(J03853) kidney alpha-2-
adrenergic receptor [Homo
sapiens] >gi|1628638 (U72648)
alpha2-C4-adrenergic receptor
[Homo sapiens]


0.50 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


678


X05319


Mouse class II MHC
E-beta2(d) gene
exon 3


0.067


585074


DYNAMIN 3 (DYNAMIN,
TESTICULAR) rat
>gi|391872|gnl|PID|d1003668
(D14076) testicular dynamin
[Rattus norvegicus]


3e-04


679 


AB006362 


Candida albicans 
CaSLNl gene, 
complete cds 


0.067 


3417296 


(AC003007) Unknown gene 
product (partial) [Homo sapiens]


9e-30


680 


AF021236 


African horse 
sickness virus capsid 
VP3 (L3) mRNA, 
complete cds 


0.066 


<NONE> 


<NONE> 


<NONE> 


681 


AE001507 


Helicobacter pylori,
strain J99 section 68 
of 132 of the 
complete genome 


0.066 


<NONE> 


<NONE> 


<NONE> 


682 


AF039717 


Caenorhabditis 
elegans cosmid 
R13H8 


0.066 


<NONE> 


<NONE> 


<NONE> 


683 


AF029027 


Syncerus caffer 
isolate Queen 
Elizabeth Mweya 14 
mitochondrial DNA 
control region 


0.066 


<NONE> 


<NONE> 


<NONE> 


684 


AF087967 


Homo sapiens full 
length insert cDNA 
clone YU51G05 


0.066 


2982476 


(X97203) CI protein [Beet curly 
top virus] 


9.5 


685 


J02037 


Baboon endogenous 
virus proviral long
terminal repeat DNA. 


0.066 


972767 


(L37868) POU-domain 
transcription factor [Homo 
sapiens] 


7.3 


686 


AF000141 


Lycopersicon 
esculentum class I 
knotted-like 
homeodomain protein 
(LeT6) mRNA, 
complete cds 


0.066 


3157926 


(AC002131) Strong similarity to
extensin-like protein gb|Z34465
from Zea mays. [Arabidopsis 
thaliana] 


5.6 


687 


AB001746 


Bensingtonia sp.
OK255 gene for 18S 
rRNA > :: 

dbj|AB001747|AB00 
1747 Bensingtonia 
sp. OK259 gene for 
18S rRNA 


0.066 


3859889 


(AF070064) cap 'n' collar
isoform C [Drosophila
melanogaster] 


0.3S
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






688


AE001461


Helicobacter pylori,
strain J99 section 22
of 132 of the
complete genome


0.065 


<NONE> 


<NONE> 


<NONE> 


689 


M30821 


Chicken erythroid
transport proteins cl 
and c2 


0.065 


<NONE> 


<NONE> 


<NONE> 


690 


AB009802 


Homo sapiens gene 
for osteonidogen, 
intron 3 


0.065 


<NONE> 


<NONE> 


<NONE> 


691 


AF086062 


Homo sapiens full 
length insert cDNA 
clone YZ06B11


0.065 


<NONE> 


<NONE> 


<NONE> 


692 


AB002369 


Human mRNA for 
KIAA0371 gene, 
complete cds 


0.065 


2500884 


SIGNAL SEQUENCE 
BINDING PROTEIN binding 
protein [Synechococcus sp.] 


5.5 


693 


AF086864 


Cyclopodia sp. large 
subunit ribosomal 
RNA gene, 
mitochondrial gene 
for mitochondrial 
RNAs, partial
sequence > :: 
gb|AF086866|AF086 
866 Penicillidia sp. 
large subunit 
ribosomal RNA gene, 
mitochondrial gene
for mitochondrial
RNAs, partial
sequence


0.065 


3721684 


(AB012957) probable glycosyl 
transferase [Vibrio cholerae] 


5.5 


694 


L44593 


Bacteriophage BK5-T 
ORF410, 3' end of
cds, 20 ORFs. 
repressor protein, and 
Cro repressor protein 
genes, complete cds, 
ORF70' gene, 5' end 
of cds. 


0.065 


1172067 


PEPTIDASE T 
(AMINOTRIPEPTIDASE)
influenzae Rd] 


3.2 


695 


U80079


Ciona intestinalis 
MyoD-family protein 
(CiMDFa) mRNA, 
complete cds 


0.065 


4218110 


(AL035353) contains EST 
j2b:F152SI 


2.5
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















696 


AB020718 


Homo sapiens mRNA 
for KIAA0911
protein, complete cds 


0.065 


1722734 


MINOR CAPSID PROTEIN L2 
>gi|1020l92type23] 


1.9 


697 


AF082137 


Zea mays copia-like 
retrotransposon Stl- 
14 leader region, 
partial sequence 


0.065 


1877501 


(U89278) polyhomeotic 2
homolog [Homo sapiens]


1.1 


698 


X64053 


R.norvegicus ZnBP 
gene for zinc binding 
protein 


0.065 


464963 


TRYPSIN PRECURSOR 


0.36 


699 


U67065 


Mus musculus 
butyrophilin (BTN) 
gene, promoter region 
and complete cds 


0.065 


2132252 


hypothetical protein YPL263c - 
yeast 


3e-10


700 


M64862 


Rat matrin F/G 
mRNA, complete cds. 


0.065 


3420183 


(AF041105) organic anion 
transporter protein 3 [Rattus 
norvegicus] 


4e-19 


701 


K02205 


Yeast (S.cerevisiae) 
transcriptional 
activator of amino 
acid-biosynthetic 
genes (GCN4) gene, 
complete cds. 


0.064 


<NONE> 


<NONE> 


<NONE> 


702 


X58282 


Maize mRNA for a 
high mobility group 
protein 


0.064 


<NONE> 


<NONE> 


<NONE> 


703 


AC001545 


Homo sapiens 
(subclone 1_f3 from
P1 H69) DNA
sequence 


0.064 


<NONE> 


<NONE> 


<NONE> 


704 


AF023461 


Homo sapiens 
FRA3B region 
sequence 


0.064 


<NONE> 


<NONE> 


<NONE> 




705


U50307


Caenorhabditis 
elegans cosmid 
F43H9. 


0.064 


<NONE> 


<NONE> 


<NONE> 


706 


U46542


Streptococcus crista
HmpA gene, partial
cds, putative
adhesin/ABC
transport system
protein (scbA) gene,
complete cds


0.064 


1209391


(D83659) TPR protein pombe]
>gi|2894282|gnl|PID|e1251103
(AL021838) pre-mrna splicing
factor. [Schizosaccharomyces
pombe]


9.2 


707 


X57564


A.rusticana mRNA
for neutral peroxidase


0.064 


1492037


(U60315) MC094R [Molluscum
contagiosum virus subtype 1]


6.9
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE






708


U06986


Human alpha-2-
macroglobulin
receptor/lipoprotein
receptor protein
(A2MR/LRP) gene,
exons 39-41.


0.064 


100800 


rab15B protein - wheat
>gi|21853 (X62476) rab protein 
[Triticum aestivum] 


5.3 


709 


D85773 


Human CpG island 
sequence, clone 
Q28B8 


0.064 


2245382 


(U88325) suppressor of 
cytokine signalling-1 [Mus
musculus] 


5.3 


710 


L06178 


Apis mellifera 
ligustica complete 
mitochondrial 
genome 


0.064 


3695379 


(AbUy63 IK)) contains similarity 
to a C. elegans hypothetical 
protein F44G4.1 (GB:Z49910)
and several yeast hypothetical
proteins such as 35.1 KD
protein in NAM8-GAR1 
intergenic region (SP:P38805) 
[Arabidopsis thaliana] 


3.2 


711 


Y16242


Triticum aestivum
mRNA for beta-
amylase 


0.064 


1175958 


HYPOTHETICAL IV.s KD
PROTEIN IN AGP3-DAK3
INTERGENIC REGION
>gi|1084712|pir||S56201
probable membrane protein
YFL054c - yeast
(Saccharomyces cerevisiae)
>gi|836701|gnl|PID|d1009825
(D50617) YFL054C 


3.1 


712 


L81779 


Homo sapiens 
(subclone 2_a2 from
P1 H25) DNA
sequence 


0.064 


3845169 


(AE001391) phosphatase (acid 
phosphatase family) 


0.8 i 


713 


X13826


C.reinhardtii psb1
mRNA for OEE1
protein of
photosystem II
(oxygen-evolving
enhancer protein)


0.064


171040 


(M94535) ATPase 
[Saccharomyces cerevisiae]
cerevisiae, Peptide, 377 aa]
[Saccharomyces cerevisiae]


0.054 


714


X06487


H.sapiens mRNA for
bcl2-Ig fusion gene


0.064 


2429362


(AF020261) proline rich protein
[Santalum album]


0.016 


715


U79638


Mus musculus cyclin-
dependent kinase
inhibitor protein
(p15(INK4b)) gene,
exon 2 and partial cds


0.064 


3929221


(AF082557) TRF1-interacting
ankyrin-related ADP-ribose
polymerase [Homo sapiens]


1e-10
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






716


U39099


Human T cell
receptor alpha chain
mRNA, partial cds


0.063


<NONE> 


<NONE> 


<NONE> 


717 


U39673 


Clostridium 
acetobutylicum KdpC 
(kdpC) gene, partial 
cds, sensor histidine
kinase homolog 
(kdpD) and response 
regulator homolog 
(kdpE) genes, 
complete cds 


0.063 


<NONE> 


<NONE> 


<NONE> 


718 


AL022317 


Human DNA 
sequence from clone 
140L1 on 

chromosome 22q13.1-
13.31, complete 
sequence [Homo 
sapiens] 


0.063 


1931640 


(U95973) Serine 
carboxypeptidase isolog 
[Arabidopsis thaliana] 


5.2 


719 


U28972 


Spiroplasma citri orfa 
and orff genes, partial 
cds, orfb, orfc, and 
orfe genes and 
Spiroplasma virus 
SpV1-derived ORF1
and ORF3 genes,
complete cds, and
SpV1-derived ORF14
gene, partial cds, 


0.063 


4091939 


(AF070704) envelope
glycoprotein [Human
immunodeficiency virus type 1]


5.2 


720 


U15159 


Mus musculus limk 
kinase (limk) mRNA, 
complete cds


0.063 


3638957 


(AC004877) sco-spondin-mucin-
like; similar to P98167 uncertain
[Homo sapiens]


5.1 


721


AF058416


Homo sapiens
lipoprotein receptor-
related protein
(LRP1), exons 39, 40,
and 41


0.063 


1788123


(AE000276) orf, hypothetical
protein [Escherichia coli]


4.0 


722


AE001430


Plasmodium
falciparum
chromosome 2,
section 67 of 73 of
the complete
sequence


0.063 


2244849


(Z97337) hypothetical protein


4.0
Nearest Neighbor (BlastN vs. Genbank)



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



SEQ
ID



ACCESSION



DESCRIPTION



P VALUE



ACCESSION



DESCRIPTION



P VALUE



723 | L29323



Streptococcus
pneumoniae methyl
transferase gene
cluster, complete
sequence



0.063



3874022



(Z70203) cDNA EST
EMBL:D72339 comes from this
gene; cDNA EST
EMBL:D75197 comes from this
gene [Caenorhabditis elegans]



2.3 



724 | X72631



H.sapiens mRNA 
encoding Rev- 
ErbAalpha > :: 
emb|X72632|HSREV 
ERB2 H.sapiens 
mRNA encoding Rev 
ErbAalpha (internal 
fragment) 



0.063 



3979878 



predicted using 
Genefinder; cDNA EST 
EMBL:T01277 comes from this 
gene; cDNA EST 
EMBL:T01796 comes from this
gene; cDNA EST 
EMBL:D32545 comes from this 
gene; cDNA EST 
EMBL:D33060 comes from this 
gene; cDNA EST EMBL:D...



1.7 



725 | U17969



Human initiation 
factor eIF-5A gene, 
complete cds. 



0.063 



2429509 



(AF025467) contains similarity 
to drosophila DNA-binding 
protein K10 (NID:g8148)
[Caenorhabditis elegans]



1.4 



726 | AE001000



Archaeoglobus 
fulgidus section 107 
of 172 of the 
complete genome 



0.063 



3462802 



(AF082486) nef protein [Human 
immunodeficiency virus type 1] 



0.35 



727 | S80986 



svp[40]=svp-related
nuclear 

receptor/retinoid 
signaling modulator 
[zebrafishes, mRNA, 
3876 nt] 



0.063 



1326288 



(U58734) weak similarity to 
ankyrin G [Caenorhabditis 
elegans] 



0.093 



728 | AF109134 



Homo sapiens 7-60 
mRNA, complete cds



0.063 



1083764 



proline-rich proteoglycan 2 
precursor, parotid - rat 
>gi|310200 (L17318) proline- 
rich proteoglycan [Rattus 
norvegicus] 



0.001



729 | D87466 



Human mRNA for 
KIAA0276 gene, 
partial cds



0.063 



2879865 



(AL021816) SPBC24E9.03C, 
unknown, len:251aa 
[Schizosaccharomyces pombe]



6e-05 



730 | AB018269



Homo sapiens mRNA 
for KIAA0726 
protein, complete cds 



0.063 



2995865 



(AF053455) tetraspan TM4SF 
[Homo sapiens]



HYPOTHETICAL 41.6 KD
PROTEIN C16C10.5 IN
CHROMOSOME III

>gi|3874383|gnl|PID|e1344077
type (RING finger)
[Caenorhabditis elegans]



2e-16



731 | DS6954 



Cricetulus griseus
mRNA for 
Cytochrome P-450 
2A14, complete cds 



0.063 



2496S96 



1e-22
Nearest Neighbor (BlastN vs. Genbank)

SEQ
ID


ACCESSION



732 | AL010232 



733 | U90714



734 | AF107044



735 | L41729 



DESCRIPTION 



Plasmodium 
falciparum DNA 
SEQUENCING IN 
PROGRESS *** 
from contig 4-58,
complete sequence 



Mycoplasma
gallisepticum
haemagglutinin
precursor genes,
complete cds
Homo sapiens clone
pCL4 DNA-binding
protein SOX21
(SOX21) gene,
complete cds

Caenorhabditis 
elegans Ro 
ribonucleoprotein 
autoantigen mRNA, 
complete cds 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION | DESCRIPTION



0.062 



<NONE> 



P VALUE



<NONE> 



0.062 



0.062 



736 | 299287 



Caenorhabditis 
elegans cosmid 
Y7A9D, complete 
sequence 
[Caenorhabditis
elegans] 



0.062 



<NONE> 



<NONE> 



2983060 



<NONE> 



<NONE> 



(AE000687) putative protein
[Aquifex aeolicus]

hi — - 



TFU 



TVET 



0.062 



1176542 



ran 

SERINE/THREONINE
PROTEIN KINASE D1044.3
IN CHROMOSOME III
>gi|4956S4 (U00065) contains
EGF-like repeats; highly similar
to ZC84.1; 3' exons similar to
protein kinase [Caenorhabditis
elegans]



<NONE> 



<NONE> 



8.6 



737 | AB014514



Homo sapiens mRNA 
for KIAA0614
protein, partial cds



738 | L29165



Human germline
immunoglobulin light
chain variable region 
(lambda-IIIb 
subgroup) from IgM 
rheumatoid factor. 



739 | U09364

740 | Y16242



Schistosoma 
japonicum Chinese 
clone pY6 
paramyosin mRNA, 

partial cds.

Triticum aestivum 
mRNA for beta-
amylase



0.062 



4033395 



0.062 



1914685 



0.062 



0.062 



13508.00 



DNA GYRASE SUBUNIT B
subunit [Myxococcus xanthus] 



3.9 



(Y12014) RAD23 protein, 
isoform II 



798 34 



MITOCHONDRIAL 
RIBOSOMAL PROTEIN S5 
hypothetical protein 1246 (uvrA 
region) - Micrococcus luteus 
(fragment) 



1.3 



1.3 



0.59
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Leishmania pifanoi 






TROPOMYOSIN I (TMI) 




741 


M97695 


cysteine proteinase 
(cys2) gene, complete 
cds. 


0.062 


1174754 


(POLYPEPTIDE 49)
>gi|320989|pir||A60607
tropomyosin - fluke 


0.018 


742 


U67526 


Methanococcus 
jannaschii section 68 
of 150 of the 
complete genome 


0.062 


1330345 


tuSa/oo) coded for by C.
elegans cDNA yk34b1.5; coded
for by C. elegans cDNA
yk13h10.5; coded for by C.
elegans cDNA yk46e8.5; coded 
for by C. elegans cDNA 
yk46d5.5; coded for by C. 
elegans cDNA yk43c2.5; coded 
for by C. elegans cDNA 
yk46e8.... 


1e-40


743 


Z78414 


Caenorhabditis 
elegans cosmid 
W09D12, complete 
sequence 
[Caenorhabditis 
elegans] 


0.061 


<NONE> 


<NONE> 


<NONE> 


744 


Y 13606 


Mus musculus gene
encoding filensin,
exons 6, 7


0.061 


2314715 


(AE000651) H. pylori predicted 
coding region HP1527


4.9 


745 


J04374 


Eggplant mosaic 
virus genome. 


0.061 


141449 


HYPOTHETICAL 35.5 KD
PROTEIN IN TRANSPOSON
TN4556 >gi|80759|pir||JQ0431
hypothetical 35.5K protein -
Streptomyces fradiae transposon 
Tn4556 


3.8 


746 


AB022200 


Marine obligately 
oligotrophic 
bacterium POO- 10 
DNA for 16S. 
ribosomal RNA,
partial sequence 


0.061 


3983593 


(AB000307) transcarboxylase- 
beta 


2.2 


747 


X54250 


Rat mRNA for zinc 
finger protein AT- 
BP2, partial cds 


0.061 


1377886 


(L46SI5) DNA binding protein 
\c [Mus musculus] 


0.98 


748 


X69942 


M.musculus mRNA
of enhancer-trap-
locus 1


0.061 


2983969 


(AE000745) putative protein
[Aquifex aeolicus]


0.57 


749 


1 

AJ223206 


Mus musculus mRNA 
for scrapie responsive 
protein 1 


0.061 


{ 

4204265 


(AC005223) 45643
[Arabidopsis thaliana]


5e-31 


750 


Y10205


H.sapiens mRNA for
3DSS protein 


0.060 


<NONE> 


<NONE> 


<NONE> 
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SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank) | Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION | DESCRIPTION

751 | U79260

752 | X07453



Human clone 23745 
mRNA, complete cds
Plasmodium 
falciparum II- 1 gene 
parti 



753 | U57502



755 | X51634



756 | AF072405

757 | AF012899

758 | AF093268

759 | X61046



Rattus norvegicus
protein tyrosine 
phosphatase delta 
gene, catalytic 
domain, partial cds. 



M.fascicularis gene
for apolipoprotein C-
III

Pseudomonas braB
gene for branched
chain amino acid
transport carrier (LIV-



P VALUE | ACCESSION
<NONE> 
<NONE> 



0.060 



0.060 



0.060 



II) 



760 | AJ005813



761 | S79S43 



Gossypium hirsutum 
cotton fiber expressed 
protein 2 (CFE2) 
mRNA, complete cds

Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds
Rattus norvegicus
homer-1c mRNA,
complete cds 
Hydra N-COL 2 
mRNA for mini 
collagen, partial cds 



Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 
enzyme 



{random amplified
hybridization
microsatellite
RAHM} [Beta
vulgaris=sugar beets,
Genomic, 537 nt]



0.060 



0.059 



0.059 



0.056 



0.054 



DESCRIPTION
<NONE> 
<NONE> 



(AF044915) polar tube protein 
3452285 PTP55 precursor 



SHUTTLE CRAFT PROTEIN 
J30843 |>gi|487400 



(U85718) CCML [Pseudomonas
1835622 putida GB-1]



0.053 



alkaline phosphatase, 145K - 
423766 Synechococcus sp.



(AF034859) juvenile hormone 
2662481 resistance protein



547847 LECTIN PRECURSOR 



<NONE> <NONE> 



0.28 



2e-04 



8.1 



0.052 



<NONE> 



<NONE> 



0.025 



1730145 



GAMETOGENESIS 

EXPRESSED PROTEIN GEG- 

154 >gi|2l3733J|pir||I48361
gene GEG-154 protein - mouse

>gi|550123 (X71642)
pid:g550123 [Mus musculus]



4.7 



33 



7.0 



<NONE> 



<NONE> 



2e-16



WO 01/02568 



PCT/US00/18374



Nearest Neighbor (BlastN vs. Genbank)

SEQ ID | ACCESSION | DESCRIPTION | P VALUE

Nearest Neighbor (BlastX vs. Non-Redundant Proteins)

ACCESSION | DESCRIPTION | P VALUE


762 | AB000096

763 | Z62366


Mouse mRNA for 
ua i n-i protein, 
complete cds

H.sapiens CpG DNA 
clone 67h7, forward 
read cpg67h7.ft1a.


0.023 
0.023 


<NONE> 
3123312 


<NONE> 
ZINC FINGER PROTEIN 142 
(KIAA0236) to Human zinc 
finger protein (ZNF142) [Homo
sapiens] 


<NONE> 

5.9 


764 | L11670


Human 

transmembrane 
glycoprotein (CD53) 
gene, exons 1 through
8. 


0.023 


80636


hypothetical 67K protein - 
Mycobacterium fortuitum 
plasmid pAL5000 >gi| 149986 
(M60875) ORF2 


3.4 


765 D83984 


Sulculus diversicolor
DNA for IDO-like

myoglobin, complete
cds 


0.023 


3114665 


(AF061267) inner membrane 
component HtxE [Pseudomonas 
stutzeri]


3.4 


766 1 X988QO 


for inorganic 
phosphate 


0.023 


683532 


(X02155) thyroglobulin [Bos 
taurus] 


1.1 


767 | U58835


Dissostichus mawsoni 
preprotrypsin gene, 
complete cds 


0.022


<NONE>


<NONE> 


<NONE> 


768 | AJ009630


Glomus versiforme 
chitin synthase gene 
(clone Gvchs3) 


0.022 


<NONE> 


<NONE> 


<NONE> 


769 | J04040


Human glucagon 
mRNA, complete cds. 


0.022 


<NONE> 


<NONE> 


<NONE> 


1 1 

770 | X74908


L.esculentum Asr3
gene


0.022 


<NONE> 


<NONE> 


<NONE> 


771 | L07293


Shigella dysenteriae
O-antigen
polysaccharide
biosynthesis rfbX, O-
antigen polymerase
(rfc), rhamnosyl
transferase I and II
(rfbR and rfbQ) and
rfbD genes, complete
cds.


0.022 


<NONE> 


<NONE> 


<NONE>








Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 




Mus musculus










772 


AF040094 


inositol 

polyphosphate 5- 
phosphatase II 
(INPP5P) mRNA, 
complete cds 


0.022 


<NONE> 


<NONE> 


<NONE> 


773 


X76776 


H.sapiens HLA-DMB 
gene 


0.022 


<NONE> 


<NONE> 


<NONE> 


774 


AE001521 


Helicobacter pylori, 
strain J99 section 82 
of 132 of the 
complete genome 


0.022 


<NONE> 


<NONE> 


<NONE> 


775 


X16004 


A.longa rbcL, rpl5,
rps8, rpl36, rps14,
rps2, trnI, trnF, trnC
and rpoB (partial)
genes > ::

emb|X75651|ALRIBP
A.longa plastid genes

for ribosomal

proteins, tRNAs, 
RNA polymerase 
subunit beta and 
rubisco large subunit 


0.022 


<NONE> 


<NONE> 


<NONE> 


776 


Y12707


cremoris plasmid 
pHW393 DNA, 
rlladii, mlladii genes 


0.022 


<NONE> 


<NONE> 


<NONE> 


777 


U27118


Arabidopsis thaliana
glutamyl-tRNA
reductase


0.022 


<NONE> 


<NONE> 


<NONE> 


778 


Z96622 


H.sapiens telomeric 
DNA sequence, clone 
5PTEL002, read 
5PTELOO002.seq 


0.022 


191333


(J05503) carbamoyl-phosphate
synthetase (E.C.6.3.5.5)


9.8 


779 


D83984


Sulculus diversicolor
DNA for IDO-like
myoglobin, complete
cds


0.022 


1078509


probable membrane protein
YDR018c - yeast


9.7 


780 


Z77952


H.sapiens flow-sorted
chromosome 6
HindIII fragment,
SC6pA4A3


0.022 


4204206


(AB022786) N-acetyl-beta-D-
glucosaminidase [Enterobacter
sp.]


7.5 



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Nearest Neighbor (BlastN vs. Genbank) 



SEQ 

ID | ACCESSION | DESCRIPTION

Xenopus laevis



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



mitochondrial DNA,
781 | M10217 | complete genome.



782 | M55147



Pea chloroplast
glyceraldehyde-3-
phosphate
dehydrogenase
(Gpb1) gene,
complete cds.



0.022 



2145763 



DESCRIPTION 



P VALUE 



B2168_C2_205 protein- 
Mycobacterium leprae 



0.022 



417308 



PROBABLE HELICASE 
MOT1 Motlp is a probable 
helicase essential for vegetative 
growth on rich glucose medium
at 30 degree C; Swiss-Prot
Accession number P32333;
similar to S. cerevisiae RAD26
gene product: Swiss-Prot
Accession number P40352



7.3 



783 | X58839



Acholeplasma virus
MV-L1 DNA for 
complete circular 
genome 



0.022 



3273189 



(AB008757) subunit II of 
c(o/b)3-type cytochrome c 
oxidase [Bacillus
stearothermophilus]



784 | M26185



Mouse c-myb
oncogene, exon 1 and
exon 2 (partial).



0.022 



138592 



VITELLOGENIN I
PRECURSOR (YOLK
PROTEIN 1)
>gi|72270|pir||VJFFi
vitellogenin I precursor
unnamed protein product
[Drosophila melanogaster] 



785 | AF061195



Streptomyces albus
valine dehydrogenase
(Vdh) gene, complete
cds 



0.022 



2088768 



(AF003145) B0414.8 gene 
product [Caenorhabditis 
elegans]



2.5 



786 | AF053622



Homo sapiens alpha
1,2-mannosidase IB
gene, exon 9



787 | Z71500



S.cerevisiae
chromosome XIV
reading frame ORF
YNL224c 



0.022 



1352361 



EARLY GROWTH 
RESPONSE PROTEIN 1 fish
>gi|531456 (U12895) egr1
[Danio rerio] rerio]



0.022 



1708875 



PUTATIVE TUMOR 
SUPPRESSOR LUCA15 
sapiens] 



0.36 



788 



D10471



Herpes simplex virus
type 2 genomic DNA
for 0.74-0.84 region,
complete cds



0.022 



(AB0114S6) short ORF [TT 
3132276 virus]



789 



U43082 



Zea mays T

cytoplasm male
sterility restorer
factor 2 (rf2) mRNA,

complete cds



0.022 



3319720 



(AL031035) putative aldehyde 
dehydrogenase [Streptomyces 
coelicolor]



0.0 1J 



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! SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



DESCRIPTION 



P VALUE 



H.sapiens simple 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



DESCRIPTION 



P VALUE 



790 



X86913 



tandem repeat DNA 
(clone wg3a6) 



0.021 



791 



AF100694 



Mus musculus 
Pontin52 mRNA,
complete cds 



<NONE> 



<NONE> 



<NONE> 



0.021 



<NONE> 



<NONE> 



792 



U34016



Nannostomus sp. 
large subunit rRNA 
gene, mitochondrial 
gene encoding 
mitochondrial rRNA 
partial sequence, 



0.021 



<NONE> 



<NONE> 



793 



X00845 



Yeast mitochondrial 
genes for 15S rRNA 
and tRNA-Trp 



0.021 



<NONE> 



<NONE> 



794 



AB012113



Homo sapiens gene 
for CC chemokine 
PARC precursor, 
complete cds 



0.021 



<NONE> 



<NONE> 



795 



U62395 



Daucus carota 
globulin-like protein
(Gea8) gene, 
complete cds 



0.021 



<NONE> 



<NONE> 



796 



M22718



falciparum actin II
gene, complete cds.



0.021 



2623773 



(AF004835) tyrocidine 
synthetase 3 [Brevibacillus 
brevis]



797 



U27118 



Arabidopsis thaliana 
glutamyl-tRNA 
reductase 



0.021 



3549885 



(AJ006631) cysteine-rich
secretory protein-1 [Equus
caballus] 



798 



X99832 



H.sapiens CLN3 
gene, complete CDS 



0.021 



262249 



(S52010) orf1 5' of EpoR [mice,
Peptide, 85 aa] [Mus sp.]



799 



800 



AF016266



Homo sapiens TRAIL 
receptor 2 mRNA, 
complete cds 



0.021 



729048 



Z92541 



Human DNA 
sequence from PAC 
179115, BRCA2 gene 
region chromosome 
13q12-13 contains
lactase-phlorizin 
hydrolase (LCT) 



0.021 



585820 



SUCCINYL- 
COA:COENZYME A 
TRANSFERASE transferase 
[Clostridium kluyveri]



LIPOPOLYSACCHARIDE 1,2-
N-

ACETYLGLUCOSAMINETR
ANSFERASE >gi|466761
(U00039) rfaK [Escherichia
coli] >gi|1790053 (AE000440)
probably hexose transferase; 
lipopolysaccharide core 
biosynthesis 



8.7 



5.3






Nearest Neighbor (BlastN vs. Genbank) 



SEQ 

ID | ACCESSION | DESCRIPTION



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



P VALUE | ACCESSION 



DESCRIPTION 



P VALUE 



dopamine D2



receptor [human,
brain, Genomic, 3794
801 | S58588 | nt, segment 4 of 5]



0.021 



2677620 



(Y08029) NAD(P)(+)-arginine 
ADP-ribosyltransferase
[Oryctolagus cuniculus]



802 



Rat nerve growth
factor-inducible
protein (VGF) gene,
M60522 complete cds. 



0.021 



4103934 



(AF030050) replication factor C 
[Rattus norvegicus]



Gallus gallus
neuregulin beta-1a
803 | AF045654 | mRNA, complete cds



0.021 



2746829 



(AF040647) No definition line 
found [Caenorhabditis elegans] 



804 | M69023 | Human globin gene.



0.021 



3880259 



(Z82056) T26H5.8
[Caenorhabditis elegans]
>gi|3880787|gnl|PID|e1350288
(AL032620) T26H5.8



3.0 



805 | Z65960 



H.sapiens CpG DNA
clone 69d2, reverse
read cpg69d2.rt1b.



0.021 



1707245 



(U80845) similar to family 1 of 
G-protein coupled receptors 
[Caenorhabditis elegans]



806 | X97073



A.oligospora gene
encoding lectin



0.021 



116949 



CORE ANTIGEN 

>gi|73601|pir||NKVLC2 core
antigen - woodchuck hepatitis
virus 2 >gi|336135



807 | X56491



808



D.melanogaster
mRNA for gene 
containing opa 
repetitive element 



Homo sapiens
(subclone I J6 from
P1 H31) DNA
J-78760 sequence



0.021 



2842750 



0.021 



113671 



HOMEOBOX PROTEIN DLX- 
7 >gi|1620520



! ! ! ! ALU CLASS F WARNING 
ENTRY !!!! 



0.16 



809 



Homo sapiens 
KIAA0404 mRNA, 
AB007864 partial cds



810 | AL021932 



Mycobacterium 
tuberculosis H37Rv 
complete genome;
segment 22/162



0.021 



118144 



0.021 



2909514 



CYSTEINE SYNTHASE A (O-
ACETYLSERINE
SULFHYDRYLASE A) (O-
ACETYLSERINE (THIOL)-
LYASE A) (CSASE A)
>gi|68323|pir||SYEBAC cysteine
synthase (EC 4.2.99.8) A -
Salmonella typhimurium
>gi|153935 (M21450) cysK
protein [Salmonella
typhimurium]



(AL021932) hypothetical 
protein Rv0439c



o.i: 



7e-10



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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















811 


U89991


Hypocrea jecorina
mannose-1-phosphate
guanylyltransferase
(MPG1) mRNA, 
complete cds 


0.021 


3581924 


(AL031538) mannose-1-
phosphate guanyltransferase
[Schizosaccharomyces pombe]


6e-20 


812 


X00641 


Sugar beet 
mitochondrial 
minicircle pO 
sequence 


0.020 


<NONE> 


<NONE> 


<NONE> 


813 


Z50097 


D.melanogaster 
mRNA for hdc 
protein. 


0.020 


<NONE> 


<NONE> 


<NONE> 


814


AF044866 


subunit ribosomal 
RNA gene, partial
sequence; tRNA-Val 
gene, complete 
sequence; and small 
subunit ribosomal 
RNA gene, partial 
sequence, 

mitochondrial genes 
for mitochondrial 
RNAs 


0.020 


<NONE> 


<NONE> 


<NONE> 


815 


AF074386 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


0.020 


<NONE> 


<NONE> 


<NONE> 


816


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


0.020 


<NONE> 


<NONE> 


<NONE> 


817 


AE001405 


Plasmodium 
falciparum 
chromosome 2, 
section 42 of 73 of 
the complete 
sequence 


0.020 


2196776 


(AF003342) bunched gene 
product [Drosophila 
melanogaster] 


8.4 


818 


AF074387 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


0.020 


627071 


histidine-rich protein - 
Plasmodium lophurae 


2.S 



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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















819 


Y13304 


Hylobates hoolock 
mitochondrial DNA 
for cytb gene, Horace 


0.020 


285580 


(D10043) ORF [Acetobacter
pasteurianus] 


2.1 


820 


Z66539 


H.sapiens creatine 
transporter gene 


0.020 


1703594 


[utiWM) coded for by C.
elegans cDNA yk7c8.5; coded
for by C. elegans cDNA
yk133b3.5; coded for by C.
elegans cDNA yk65a4.5; coded 
for by C. elegans cDNA 
yk7c8.3; coded for by C. 
elegans cDNA CEESQ66F; 
coded for by C. elegans cDNA 
yk65a4.3;... 


0.98 


821 


AF053622 


Homo sapiens alpha 
1,2-mannosidase IB 
gene, exon 9 


0.020 


1352361 


EARLY GROWTH 
RESPONSE PROTEIN 1 fish
>gi|531456 (U12895) egr1
[Danio rerio] rerio]


0.72 


822 


M20555 


Human MHC class II 
HLA-DRw53-beta 
(DR4,w4) gene, 
exons 2,3,4,5,6. 


0.020 


465569 


HYPOTHETICAL J5.1 KD
PROTEIN IN SBCB-HISL 
INTERGENIC REGION 
>gi|405956 (U00009) 
ORF_ID:o349#4; similar to 
[SwissProt Accession Number 
P33015] [Escherichia coli] 
>gi|1736693|gnl|PID|dl016570 
Number P33015] [Escherichia
coli] >gi|1788323 (AE000292)
putative transport system 
permease protein [Escherichia 
coli] 


0.43 


823 


M20555 


Human MHC class II 
HLA-DRw53-beta 
(DR4,w4) gene, 
exons 2,3,4,5,6. 


0.020 


1709751 


COENZYME PQQ 
SYNTHESIS PROTEIN F 
synthesis F - Pseudomonas 
fluorescens >gi|929802


0.42 



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Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



DESCRIPTION 



P VALUE 



824 | AJ005015



825 | AF034099



826 | AF100694



827 | AF093268 



Homo sapiens mRNA 
for putative SMC-like 
protein, partial 



Laccaria bicolor 
glyoxal malate 
synthase protein 
mRNA, complete cds



Mus musculus 
Pontin52 mRNA, 
complete cds 



Rattus norvegicus 
homer-1c mRNA,
complete cds 



0.020 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



267449 



0.020 



0.019 



0.019 



1109847 



132836 



2633401 



DESCRIPTION 



PROTEIN ZK637.2 IN

CHROMOSOME III
>gi|102507|pir||S15787
hypothetical protein 1 (cosmid 
ZK637) - Caenorhabditis 
elegans Genefinder; cDNA EST 
yk217b5.3 comes from this 
gene; cDNA EST yk217b5.5 
comes from this gene; cDNA 
EST yk340g12.3 comes from
this gene; cDNA EST
yk340g12.5 comes from this
gene; cDNA EST yk428c5.5 
co... 



(U41538) No definition line 
found [Caenorhabditis elegans] 



60S RIBOSOMAL PROTEIN
L28 protein L28 [Rattus 
norvegicus] 



(Z99109) similar to DNA 
exonuclease 



1e-12



1e-22



5.7 



828 | AF100694



829 | U67538



Mus musculus 
Pontin52 mRNA, 
complete cds 



Methanococcus 
jannaschii section 80 
of 150 of the 
complete genome 



0.019 



2492604 



MULTIDRUG RESISTANCE 
PROTEIN CDR2 albicans] 



0.019 



1723566 



PUTATIVE

GLUCOSYLTRANSFERASE
C17C9.07

>gi|1314159|gnl|PID|e241760
(Z73099) SPAC17C9.07,
putative glucosyl transferase len
501, similar to
SW:ALG8_YEAST P40351
glucosyltransferase ALG8
pombe]



4.4 



830 | U56088 



Human periodic 
tryptophan protein 2 
(PWP2) gene, exons 
3 to 14 



0.019 



2144804 



831 | U76524



Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA. complete cds 



0.018 



1916976 



collagen alpha 1(11) chain 
bovine 



(U91682) vitelline membrane 
protein homolog [Aedes 
aegypti] 



0.040 



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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















832 


AF026258 


Onobrychis viciifolia
chalcone synthase 
(CHS) mRNA, 
complete cds 


0.018 


763076 


(Z48799) ZP3 [Cyprinus carpio] 
>gi|777724 (L41637) egg 
membrane protein [Cyprinus 
carpio] 


5.2 


833 


U95094 


Xenopus laevis XL- 
INCENP (XL- 
INCENP) mRNA, 
complete cds 


0.009 


3955011 


(AJ005438) beta adrenoreceptor
B 


0.60 


834 


X71603 


C.jejuni VS1 DNA >

emb|A39603|A39603
Sequence 2 from
Patent WO9417205 >
:: gb|I76090|I76090 
Sequence 2 from 
patent US 5691138 


0.008 


<NONE> 


<NONE> 


<NONE> 


835 


AF093268 


Rattus norvegicus 
homer-1c mRNA,
complete cds 


0.008 


138116 


HEAD FIBER PROTEIN 
(LATE PROTEIN GP8.5) 
>gi|75846|pir||WMBP8H gene 
8.5 protein - phage PZA
>gi|216057 (M11813) head
fiber protein 


8.1 


836 


X91751 


Bovine herpesvirus 
type 1 UL7 gene 


0.008 


1711436 


SUPEROXIDE DISMUTASE 
(FE) 1.15.1.1) (Fe)- 
Pseudomonas aeruginosa 
>gi|409767 


5.9 


837 


M95594 


Arabidopsis thaliana 
1-aminocyclopropane- 
1-carboxylate 
synthase (ACS2) 
gene, complete cds.


0.008 


683698 


(Z48229) orf 1 gene product 
[Saccharomyces cerevisiae] 


1e-06


838 


U67465 


Methanococcus 
jannaschii section 7 
of 150 of the 
complete genome 


0.008 


3874664 


(Z68493) predicted using 
Genefinder 


1e-07


839 


X72388 


B.taurus mRNA for 
filensin 


0.008 


100174 


1-aminocyclopropane- 1- 
carboxylate synthase 


7e-09 


840 


U22398 


Human Cdk-inhibitor 
p57KIP2 (KIP2) 
mRNA. complete cds. 


0.008


2228750 


(U93868) RNA polymerase III 
subunit [Homo sapiens] 


2e-lS 


841 


L42546 


Xenopus laevis LIM 
class homeodomain 
protein 


0.007 


<NONE> 


<NONE> 


<NONE> 








Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Homo sapiens 










842 


AF041428


ribosomal protein s4
X isoform gene,
complete cds 


0.007 


<NONE> 


<NONE> 


<NONE> 


843 


AF000227 


Secale cereale omega 
secalin gene, 
complete cds 


0.007 


<NONE> 


<NONE> 




844 


D86254 


Human MHC (HLA) 
DRB intron i DNA, 
partial sequence 


0.007 


<NONE> 


<NONE> 


<NONE> 


845 


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA. complete cds 


0.007 


<NONE> 


<NONE> 


<NONE> 


846 


Y07738 


M.musculus gene for 
vimentin 


0.007 


<NONE> 


<NONE> 


<NONE> 


847 


AJ005813 


Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 
enzyme 


0.007 


<NONE> 


<NONE> 


<NONE> 


848 


AF055119 


Homo sapiens alpha- 
tectorin(TECTA) 
gene, exon 6 


0.007 


<NONE> 


<NONE> 


<NONE> 


849 


M61195


Zucchini 1- 
aminocyclopropane- 1- 
carboxylate synthase 


0.007 


<NONE> 


<NONE> 


<NONE> 


850 


Y11050


Homo sapiens DSG3 
gene, partial intron 
and partial exon 6, 
140 bp 


0.007 


<NONE> 


<NONE> 


<NONE> 


851 


X61204 


M.voltae vhuD,
vhuG, vhuA, vhuU &
vhuB genes


0.007 


<NONE> 


<NONE> 


<NONE> 


852 


AB012105 


Brassica rapa mRNA
for SLG45, complete 
cds 


0.007 


<NONE> 


<NONE> 


<NONE> 


853 


S43S82 


telomere:

minichromosome,
repeats)

[Trypanosoma brucei,
Genomic, 1170 nt]


0.007


<NONE>


<NONE>


<NONE>



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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















854 


L32674 


Geomydoecus nadleri 
mitochondrial 
cytochrome oxidase I 
gene, partial cds. 


0.007 


<NONE> 


<NONE> 


<NONE> 


855 


U58732 


Caenorhabditis 
elegans cosmid 
F48D6. 


0.007


<NONE>


<NONE>


<NONE>


856 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


0.007


<NONE> 


<NONE> 


<NONE> 


857 


Z35284


H.sapiens mRNA for 
MDR3 P- 
glycoprotein 


0.007 


1730696 


HYPOTHETICAL 121.1 KD
PROTEIN IN BIO3-HXT17
INTERGENIC REGION
PRECURSOR YNR067c - yeast
(Saccharomyces cerevisiae)


9.5 


858 


X15217 


Human sno oncogene 
mRNA for snoA 
protein, ski-related 


0.007 


902455 


(U24203) membrane protein 
[Escherichia coli]


8.S 


859 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


0.007 


1684636 


(Y09454) ORF3 [Lactobacillus 
casei bacteriophage A2]


8.3 


860 


AF012899


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


0.007 


3878803 


(Z48795) R05H5.7
[Caenorhabditis elegans]


8.3


861 


S76317 


Tiy=lHU-2U6 kda 
membrane protein 
scavenger receptor 
homolog (clone 18, 
intron and flanking 
exons 14 and 15} 
[sheep, lymph node, 
lymphocytes, 
Genomic. 308 nt, 
segment 2 of 2] 


0.007 


294747 


(L08174) ORF2 
[Romanomermis culicivorax] 


7.4 



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Nearest Neighbor (BlastN vs. Genbank) 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



SEQ 

ID | ACCESSION



DESCRIPTION 



P VALUE 



ACCESSION 



DESCRIPTION 



P VALUE 



862 | D88084 



Pedicularis
verticillata
chloroplast DNA,
intergenic region
between trnT(UGU)
and trnL(UAA) 5' exon



0.007 



2555187 



(AF026789) vitellogenin 
[Pimpla nipponica]



6.9 



863 | X58869



Chicken mRNA for 

aldehyde 

dehydrogenase 



0.007 



115978 



CD30L RECEPTOR 
PRECURSOR 
(LYMPHOCYTE 
ACTIVATION ANTIGEN 



6.5 



864 | D87120 



Homo sapiens mRNA 
for GS3786, complete 
cds 



0.007 



3879589 



cDNA EST EMBL:D35637 
comes from this gene; cDNA 
EST yk322a3.5 comes from this 
gene; cDNA EST yk397b2.5 
comes from this gene; cDNA 
EST yk348b11.5 comes from
this gene; cDNA EST 
yk397b2.3 comes fr...
>gi|3880965|gnl|PID|e1350578
comes from this gene; cDNA 
EST yk322a3.5 comes from this 
gene; cDNA EST yk397b2.5 
comes from this gene; cDNA 
EST yk348b11.5 comes from
this gene; cDNA EST
yk397b2.3 comes ...



5.1 



865 | X68793



H.sapiens gene for 
antithrombin III 



0.007 



2358285 



(AF010403) ALR [Homo
sapiens] 



3.8 



866 | AJ001596 



Danio rerio mRNA 
for opioid receptor 
homologue 



0.007 



2507509 



HYPOTHETICAL '^.8 KD 
PROTEIN IN HOLB-PTSG 
INTERGENIC REGION 
>gi| 1787342 (AE000210) orf, 
hypothetical protein 
[Escherichia coli] protein in 
holB 3'region . [Escherichia 
coli] 



1.9 



867 | AF061195



Streptomyces albus 
valine dehydrogenase 
(Vdh) gene, complete 
cds 



0.007 



2088768



(AF003145) B0414.8 gene
product [Caenorhabditis
elegans]



868 | AJ005813



Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 
enzyme 



0.007 



1710105 



UDP-N- 

ACETYLGLUCOSAMINE 2-
EPIMERASE UDP-N-
acetylglucosamine 2-epimerase
Plasmid pWQ7991 



1.9 



1.7
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Zebra fish retinoic 










869 


L03398 


acid receptor alpha 
2.A 


0.007 


2239219 


(Z97210) hypothetical protein 


0.77 


870 


D63484 


Human mRNA for 
KIAA0150 gene, 
partial cds 


0.007 


19917 


(Z14014) Pistil extensin like 
protein, partial CDS only 


0.61 


871 


M31483 


Maize glyceraldehyde
3- phosphate 
dehydrogenase, 3' 
end. 


0.007 


543068 


mucin, tracheobronchial - dog 
>gi|402558 


0.45 


872 


AF090115


Lycopersicon 
esculentum cytosolic 
class II small heat 
shock protein HCT2 
(HSP17.4) mRNA, 
complete cds 


0.007 


2494941 


ALPHA-2B ADRENERGIC 
RECEPTOR adrenoceptor 
[Cavia porcellus]
>gi|1587159|prf||2206293B
adrenoceptor alpha2B [Cavia 
porcellus] 


0.42 


873 


AF064029 


Helianthus tuberosus 
lectin 1 mRNA, 
complete cds 


0.007 


1110587 


(S79410) nuclear localization
signals Peptide, 140 aa] [Mus
sp.]


0.26 


874 


X88931 


H.sapiens PAL2A 
gene 


0.007 


1706176 


CUTINASE TRANSCRIPTION 
FACTOR 1 ALPHA
>gi|1262912 (U51671) cutinase
transcription factor 1 [Fusarium 
solani f. sp. pisi] 


0.21 


875 


S74155


zRAR alpha=retinoic
acid receptor alpha
[zebrafish, embryos,
mRNA, 1773 nt]


0.007 


2239219 


(Z97210) hypothetical protein 


0.11 


876 


M74193


Petromyzon marinus 
plasma albumin 
mRNA, complete cds. 


0.007 


73088S 


OCTAPEPTIDE-REPEAT 
PROTEIN T2 


0.011 


877 


U03673 


Saccharomyces 
cerevisiae Spp41p 
(SPP41) gene, 
complete cds. 


0.007 


3820885 


(AL033126) 65G3.k 
[Drosophila melanogaster]


0.001 


878 


D37766 


Homo sapiens mRNA
for Laminin-5 beta3 
chain, complete cds 


0.007 


1235974 


(X96713) collagen [Globodera 
pallida]


3e-06
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Caenorhabditis 










879 


AF022388 


elegans putative 
transcription factor 
MAB-3 (mab-3) 
gene, complete cds 


0.007 


3747107 


(AF095741) unknown [Rattus
norvegicus]


5e-09 


880 


U89984 


Acanthamoeba 
castellanii 
transformation- 
sensitive protein 
homolog mRNA, 
complete cds 


0.007 


1890281 


(U89984) transformation- 
sensitive protein homolog


2e-09 


881 


AB020689 


Homo sapiens mRNA 
for KIAA0882 
protein, partial cds 


0.007 


3880809 


rabGAP domains; cDNA EST 
EMBL:D34945 comes from this 
gene; cDNA EST 
EMBL:D27313 comes from this 
gene; cDNA EST 
EMBL:D34829 comes from this 
gene; cDNA EST 
EMBL:D27312 comes from this 
gene; cDNA ... Probable 
rabGAP domains; cDNA EST 
EMBL:D34945 comes from this 
gene; cDNA EST 
EMBL:D27313 comes from this 
gene; cDNA EST 
EMBL:D34829 comes from this 
gene; cDNA EST 
EMBL:D27312 comes from this 
gene; cDNA ... 


le-23 


882 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds 


0.006 


<NONE> 


<NONE> 


<NONE> 


883


AF027173


Arabidopsis thaliana
cellulose synthase
catalytic subunit (Ath-
A) mRNA, complete
cds


0.006 


<NONE> 


<NONE> 


<NONE> 


884 


U76524


Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds


0.006 


<NONE> 


<NONE> 


<NONE> 
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















885 


U76524 


Sambucus nigra 
ribosome inactivating
protein precursor 
mRNA. complete cds 


0.006 


<NONE> 


<NONE> 


<NONE> 


886 


AJ005813


Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 
enzyme 


0.006 


<NONE> 


<NONE> 


<NONE> 


887 


AB012106 


Brassica rapa mRNA
for SRK45, complete 
cds 


0.006 


<NONE> 


<NONE> 


<NONE> 


888 


M80529 


Rattus norvegicus 
ceruloplasmin gene, 
exon 1 and 5' flank 


0.006 


<NONE> 


<NONE> 


<NONE> 


889


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


0.006 


99408 


hypothetical protein 6 - 
Chlamydomonas reinhardtii 
transposon 

>gi|1360717|gnl|PID|e33461 
reinhardtii] 


9.6


890


U76523


Sambucus nigra lectin 
precursor mRNA, 
complete cds 


0.006 


4039024 


(AF039110) polyprotein
[Rubella virus] 


9.3 


891 


AF093268 


Rattus norvegicus 
homer-1c mRNA,
complete cds 


0.006 


160533 


(M9442S) merozoite surface 
antigen 1 [Plasmodium vivax] 


7.5 


892 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


0.006 


4019458 


(AF093984) envelope 
glycoprotein [Human 
immunodeficiency virus type 1] 


7.0 


893 


AJ005813 


Arabidopsis thaliana 
mRNA for
neoxanthin cleavage 
enzyme 


0.006 


1916976 


(U91682) vitelline membrane 
protein homolog [Aedes 
aegypti] 


6.8 


894 


AF093268 


Rattus norvegicus 
homer-1c mRNA,
complete cds 


0.006 


102059 


promastigote surface antigen- 2 
(clone 4.6) - Leishmania major 
(fragment) >gi|9583 (X57135) 
surface antigen P2 [Leishmania 
major] 


2.4 


895 


AF093268 


Rattus norvegicus 
homer-1c mRNA,
complete cds


0.006 


3171241 


(AF067204) transcription factor 
BF-1 [Danio rerio]


1.0 


896 


X993S4 


M.musculus mRNA 
for paladin gene


0.003 


<NONE> 


<NONE> 


<NONE> 






897 



AF027174



Arabidopsis thaliana
cellulose synthase
catalytic subunit (Ath-
B) mRNA, complete
cds



0.003 



<NONE> 



<NONE> 



<NONE>



898



AE001148



Borrelia burgdorferi
(section 34 of 70) of
the complete genome



0.003



4160388 



899 



AF027173



Arabidopsis thaliana
cellulose synthase
catalytic subunit (Ath-
A) mRNA, complete
cds



900 



U72396



Lycopersicon
esculentum class II
small heat shock
protein Le-HSP17.6
mRNA, complete cds



0.003 



1709213 



0.002 



<NONE> 



(AJ011856) ORF Q0255
[Saccharomyces cerevisiae]



7.6 



NUCLEAR ENVELOPE PORE 
MEMBRANE PROTEIN POM 
121 (PORE MEMBRANE 
PROTEIN OF 121 KD) (P145) 



<NONE> 



1.5 



901 



AF100694



Mus musculus
Pontin52 mRNA,
complete cds



0.002 



<NONE> 



<NONE> 



902



AF104631



Chlamydomonas
reinhardtii light
harvesting complex II
protein precursor
(Lhcb3) mRNA,
complete cds



0.002 



<NONE> 



<NONE> 



903



AF100694



Mus musculus
Pontin52 mRNA,
complete cds



0.002 



<NONE> 



<NONE> 



904 



AB012106



Brassica rapa mRNA
for SRK45, complete
cds



0.002 



<NONE> 



<NONE> 



905 



M21339



Human non-histone
chromosomal protein
HMG-14 gene,
complete cds.



0.002 



<NONE> 



906



AF012899



Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds



0.002 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



907



X57103



Human h-lys gene for
lysozyme (upstream
region)



0.002 



<NONE> 



<NONE> 



908



AF074386



Sambucus nigra
hevein-like protein
mRNA, complete cds



909



U01066



Human CD4
promoter, partial
sequence.



0.002 



<NONE> 



0.002 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



910



L28094



Barley mRNA 
sequence. 



0.002 



<NONE> 



<NONE> 



<NONE> 
<NONE> 



911 



AD000833



Homo sapiens DNA
from chromosome 19
cosmid f 19399 (-17
kb EcoRI restriction
fragment)



0.002 



<NONE> 



912



AJ011701



Homo sapiens TRHR
gene promoter and
exons 1-2, partial



913



AF100694



Mus musculus
Pontin52 mRNA,
complete cds



0.002 



<NONE> 



0.002 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



914



AF037062



Homo sapiens retinol
dehydrogenase gene,
complete cds



0.002 



915



AF093268



Rattus norvegicus
homer-1c mRNA,
complete cds



<NONE> 



916 



U67608



Methanococcus
jannaschii section 150
of 150 of the
complete genome



0.002 



<NONE> 



0.002 



<NONE> 



<NONE> 



<NONE> 



<NONE>



<NONE> 



<NONE> 



917 



AF027173



Arabidopsis thaliana
cellulose synthase
catalytic subunit (Ath-
A) mRNA, complete
cds



918 



Z46736



H.sapiens DNA for
repeat region (ABM-
C82)



0.002 



<NONE> 



919



AB012106



Brassica rapa mRNA
for SRK45, complete
cds



0.002 



<NONE> 



0.002 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 









920


Z85983


X.laevis mRNA for
NOVA protein


0.002 


<NONE> 


<NONE> 


<NONE> 


921 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


0.002 


<NONE> 


<NONE> 


<NONE> 


922 


S61977 


medium-chain acyl- 
CoA dehydrogenase 
{exon 10, intron 10} 
[human, Genomic, 
1407 nt] 


0.002 


<NONE> 


<NONE> 


<NONE> 


923 


AJ005813


Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 
enzyme 


0.002 


<NONE> 


<NONE> 


<NONE> 


924 


AB012105 


Brassica rapa mRNA
for SLG45, complete 
cds 


0.002 


<NONE> 


<NONE> 


<NONE> 


925 


AB012106 


Brassica rapa mRNA
for SRK45, complete 
cds 


0.002 


<NONE> 


<NONE> 


<NONE> 


926 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


0.002 


<NONE> 


<NONE> 


<NONE> 


927 


X51646


H.sapiens DNA for 
dopamine D2 
receptor gene


0.002 


3329125 


(AE001337) Yop C/Gen 
Secretion Protein D [Chlamydia 
trachomatis] 




928


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


0.002 


465762 


HYVUlHbULAL ll^.i isjj 
PROTEIN C06G4.1 IN
CHROMOSOME III 
>gi|630524|pir||S44748 
C06G4.1 protein - 
Caenorhabditis elegans 
>gi|409292 (L25598) homology 
with vigilin; coded for by C. 
elegans cDNA 

GenBank:M88954 (CEL12C9); 
putative [Caenorhabditis 


8.9 


929 


U4S47S 


Human skeletal 
muscle ryanodine 
receptor gene


0.002 


2137221 


co-repressor protein - mouse 
>gi|642619


6.9 
















930 


AF100694 


Mus musculus
Pontin52 mRNA,
complete cds 


0.002 


806536 


(Z22520) membrane protein 
[Bacillus acidopullulyticus] 


6.3 


931 


AF100694 


Mus musculus
Pontin52 mRNA, 
complete cds 


0.002 


3881055 


(AL023844) Y48A6B.1 
[Caenorhabditis elegans] 


5.8 


932 


AF090115


Lycopersicon
esculentum cytosolic
class II small heat
shock protein HCT2
(HSP17.4) mRNA,
complete cds 


0.002


3878330 


(Z81097) K07A1.4 
[Caenorhabditis elegans]


4.8 


933


AF093268


Rattus norvegicus
homer-1c mRNA,
complete cds


0.002 


137640 


REPLICATION PROTEIN E1
papillomavirus 


4.0 


934 


AF019660 


Mus musculus 
nuclear orphan 
receptor RORgamma 


0.002 


1330365 


(U58757) similar to nucleotide 
pyrophosphatases 


3.9 


935 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


0.002 


1785972 


(U46951) ORF5; Method:
conceptual translation supplied 
by author 


3.7 


936 


V00508 


Human gene for
epsilon-globin.


0.002 


1333804 


(X56032) protease 
[Ruminococcus flavefaciens] 


3.5 


937 


AB012105


Brassica rapa mRNA 
for SLG45, complete 
cds 


0.002 


4153876 


(AC005531) similar to mouse
homeodomain-interacting
protein kinase 2; similar to
AF077659 (PID:g3702958)


3.0 


938 


AJ005813 


Arabidopsis thaliana
mRNA for 
neoxanthin cleavage 
enzyme 


0.002 


1070461 


ornithine carbamoyltransferase 
(EC 2.1.3.3) - yeast
(Saccharomyces cerevisiae)
>gi|929866 (X83502)
pid:e130025 [Saccharomyces
cerevisiae] >gi|1008256


2.8 


939 


S41458 


rod cGMP 
phosphodiesterase 
beta-subunit [human, 
mRNA, 3231 nt]


0.002 


3450883 


(AF083334) fibroin [Antheraea 
pernyi] 


1.6 
















940 


X06286 


melanogaster Gart
locus with genes for
GARS=phosphoribos
ylamineglycine
ligase,
AIRS=phosphoribosy
lformylglycinamidine
cyclo-ligase,
GART=glycinamide
ribotide
transformylase > ::
gb|J02527|DROGAR
T D. melanogaster
Gart gene encoding
two polypeptides with
GAR synthase, AIR
synthase, and GAR
transformylase
enzyme activities and
a pupal cuticle gene
nested within intron
A of the Gart gene,


0.002 


2662054 


(AB004651) isocitrate lyase 


1.5 


941 


AF015812 


Homo sapiens RNA 
helicase p68
(HUMP68) gene,
complete cds 


0.002 


3641659 


(AB008374) alpha 3 type I 
collagen 


1.1 


942 


X78925 


H.sapiens HZF2 
mRNA for zinc finger 
protein 


0.002 


141624 


ZINC FINGER PROTEIN ZFP- 
37 (MALE GERM CELL 
SPECIFIC ZINC FINGER 
PROTEIN) 


1.0 


943 


AF074386 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


0.002 


3879997 


(Z49071) weak similarity with 
mu-type opioid receptor (Swiss 
Prot accession number (P33535) 


1.0 


944 


Z69639 


Human DNA 
sequence from 
cosmid L241B9, 
Huntington's Disease 
Region, chromosome 
4p16.3 contains
polymorphic VNTR
PYNZ32. 


0.002 


3523162 


(AF076292) TGF-beta/activin
signal transducer FAST-1p


0.81 



















945 


AF074387


Sambucus nigra
hevein-like protein
mRNA, complete cds 


0.002 


2984161 


(AE000761) hypothetical 
protein [Aquifex aeolicus]


0.80 


946 


AF093268 


Rattus norvegicus 
homer-1c mRNA,
complete cds 


0.002 


101830 


hypothetical protein B - chestnut 
blight fungus 


0.72 


947 


AF017307 


Homo sapiens Ets- 
related transcription 
factor (ERT) mRNA, 
complete cds 


0.002 


200531 


(M18071) prion protein [Mus
musculus]


0.72 


948 


U11383


Drosophila
melanogaster Ovo-
1028aa (ovo) mRNA,
complete cds. 


0.002 


2465207 


(AF016045) OVO-like 1 
binding protein [Homo sapiens]


0.35 


949 


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


0.002 


3834294 


(U80846) No definition line 
found [Caenorhabditis elegans] 


0.29 


950 


AF086315 


Homo sapiens full 
length insert cDNA 
clone ZD52F10 


0.002 


545067 


(S68356) action potential 
broadening potassium 
channel=Shab [Aplysia, bag cell
neurons, head ganglia, Peptide,
905 aa] [Aplysia]
>gi|743110|prf||2011375A K
channel [Aplysia californica] 


0.15 


951 


X53096 


S. aureus genes 
encoding Sau96I 
DNA
methyltransferase and
Sau96I restriction 
endonuclease 


0.002 


2529575 


(AF018164) kinesin-like protein 
3C [Homo sapiens] 


0.11 


952 


AB012105


Brassica rapa mRNA 
for SLG45, complete 
cds 


0.002 


729918 


LA PROTEIN HOMOLOG (LA 
RIBONUCLEOPROTEIN) (LA 
AUTOANTIGEN HOMOLOG) 


0.092 


953 


X73973 


G.gallus RAR- 
gamma2 mRNA for 
retinoic acid receptor 


0.002 


586122 


TRICHOHYALIN
>gi|423321|pir||A40691
trichohyalin - sheep >gi|295941
(Z18361) trichohyalin


0.073 


954 


S41458


rod cGMP
phosphodiesterase
beta-subunit [human,
mRNA, 3231 nt]


0.002 


1017427 


(X90569) elastic titin [Homo
sapiens]


0.013 






955


M35887


D.melanogaster
defective chorion-1
fc125 (dec-1) gene,
complete cds.


0.002


1825606


(U88169) similar to
molybdopterin biosynthesis
MOEB proteins [Caenorhabditis
elegans]


0.008 


956 


AF034099 


Laccaria bicolor 
glyoxaJ malate 
synthase protein 
mRNA, complete cds 


0.002 


1825593 


(U88167) D2092.2 gene product 
[Caenorhabditis elegans]


1e-06


957 


AF033929 


Bactrocera dorsalis
strain Tahiti
mitochondrial D-loop
region, complete 
sequence 


9e-04 


<NONE> 


<NONE> 


<NONE> 


958 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


8e-04 


<NONE> 


<NONE>


<NONE> 


959 


AF029062 


Homo sapiens DEAD- 
box protein (BAT1) 
gene, partial cds 


8e-04 


<NONE> 


<NONE> 


<NONE> 


960 


U70671 


Human ataxin-2
related protein 
mRNA, partial cds 


8e-04 


<NONE> 


<NONE> 


<NONE> 


961 


AF051709


Dendrocopos 
leucopterus clone 2 
microsatellite HrU2 
repeat region 


8e-04 


<NONE> 


<NONE> 


<NONE> 


962 


X14077


Pea phy gene for
phytochrome
apoprotein


8e-04 


<NONE> 


<NONE> 


<NONE> 


963 


AC004497


Homo sapiens 
chromosome 21, Pi 
done I RNTT #6 


8e-04


AC *7 1 A £. 
HD 1 140 


(L27838) rhoptry protein 
[Plasmodium yoelii] 


9.6 


964 


AF077344


Homo sapiens
cartilage-derived C-
type lectin


8e-04 


3702123 


(AJ011707) TraD protein
[Escherichia coli] 


8.5


965 


X85117


H.sapiens epb72 gene
exons 2,3,4,5,6,7


8e-04


2570059


(AJ004687) N-4 cytosine-
specific methyltransferase
[Neisseria gonorrhoeae]


6.8 


966 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


8e-04 


1345S59


COPPER TRANSPORT
PROTEIN CTR1 transport
protein - yeast (Saccharomyces
cerevisiae) gene product
[Saccharomyces cerevisiae]


6.7 









967


AF031403


Homo sapiens
MLL/AF4
translocation
breakpoint
t(4;11)(q21;23)


8e-04 


2498926 


SMALL PROTEIN B 
HOMOLOG A43259, from E. 
hirae [Mycoplasma 
pneumoniae]


6.6 


968 


L29252 


Human (clone DI3-2)
L-iditol-2
dehydrogenase gene,
exon 4, exon 5, exon 
6 and exon 7. 


8e-04 


1488070 


(U63997) putative transposase 
[Enterococcus faecium] 


5.2 


969 


X16995


Mouse N10 gene for 
a nuclear hormonal 
binding receptor 


8e-04 


1493833 


(U47323) stromal cell protein 
[Mus musculus]


3.2 


970 


M99412 


Human interleukin-8
receptor (IL8RB) 
gene, complete cds 


8e-04 


1346101 


4-AMINOBUTYRATE 
AMINOTRANSFERASE 
TRANSAMINASE) (GABA
AMINOTRANSFERASE)
homolog - smut fungus
(Ustilago maydis) >gi|881562
Emericella nidulans gamma-
amino-n-butyrate transaminase
Swiss-Prot Accession Number
P14010 [Ustilago maydis]


0.83 


971 


U37452 


Human Down 
Syndrome region of 
chromosome 21 
genomic sequence, 
clone A31D6-1C5. 


8e-04 


4164069 


(AF111093) latrophilin 3 splice
variant bbah [Bos taurus] 


0.26 


972 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


8e-04 



1352877 


H XfiJi tit ULAL lj.U JUJ 

PROTEIN IN RAD26-GEF1 
INTERGENIC REGION 
>gi|1077881|pir||S57057
probable membrane protein
YJR038c - yeast
(Saccharomyces cerevisiae)
>gi|1015688 (Z49538) ORF
YJR038c putative
[Saccharomyces cerevisiae]


0.23 


973 


AF093268


Rattus norvegicus
homer-1c mRNA,
complete cds


8e-04 


1788557


(AE000312) orf, hypothetical
protein [Escherichia coli]


0.19 
















974 


X83872 


H. vulgaris mRNA for 
cAMP response 
element binding 
protein 


8e-04 


1175386 


HYPOTHETICAL^.; 1 KD 
PROTEIN C18B11.06 IN
CHROMOSOME I
>gi|2130289|pir||S58305
hypothetical protein
SPAC18B11.06 - fission yeast
hypothetical protein 
[Schizosaccharomyces pombe] 


0.005 


975 


M32514 


Rat simple sequence 
DNA, clone 5. 


8e-04 


2394492


(AF024502) No definition line 
found [Caenorhabditis elegans] 


0.002 


976 


AF074386 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


8e-04 


2981631 


(AB012223) ORF2 [Canis 
familiaris]


0.001 


977 


X89211 


H.sapiens DNA for 
endogenous retroviral 
like element 


8e-04 


2065210 


(Y12713) Pro-Pol-dUTPase
polyprotein 


3e-04 


978 


U14391 


Human myosin-IC 
mRNA, complete cds. 


8e-04 


3142302 


(AC002411) Strong similarity to
myosin heavy chain gb|234293 
from A. thaliana. [Arabidopsis 
thaliana] 


4e-16 


979 


L13612


Drosophila 
melanogaster dead- 
box protein 
D.melanogaster 
DEAD-box gene, 
complete CDS 


8e-04 


3776027 


(AJ010475) RNA helicase 
[Arabidopsis thaliana] 


9e-24 


980 


AF074386 


Sambucus nigra 
hevein-like protein
mRNA, complete cds 


7e-04 


<NONE> 


<NONE> 


<NONE> 


981 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


7e-04 


<NONE> 


<NONE> 


<NONE> 


982 


AF093268 


Rattus norvegicus 
homer-1c mRNA,
complete cds 


7e-04 


<NONE> 


<NONE> 


<NONE> 


983 


Z739S7 


Human DNA 
sequence from 
cosmid N120B6 on 
chromosome 22 
Contains ESTs, 
complete sequence 
[Homo sapiens] 


7e-04 


<NONE> 


<NONE> 


<NONE> 









984


AB012106


Brassica rapa mRNA
for SRK45, complete
cds


7e-04 


<NONE> 


<NONE> 


<NONE> 


985 


AF093268 


Rattus norvegicus 
homer-1c mRNA,
complete cds 


7e-04 


<NONE> 


<NONE> 


<NONE> 


986 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


7e-04 


<NONE> 


<NONE> 


<NONE> 


987 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


7e-04 


<NONE> 


<NONE> 


<NONE> 


988 


AJ005813 


Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 
enzyme 


7e-04 


<NONE> 


<NONE> 


<NONE> 


989 


AF064029 


Helianthus tuberosus
lectin 1 mRNA, 
complete cds 


7e-04 


<NONE> 


<NONE> 


<NONE> 


990 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


7e-04 


<NONE> 


<NONE> 


<NONE> 


991 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


7e-04 


<NONE> 


<NONE> 


<NONE> 


992 


AF064029 


Helianthus tuberosus 
lectin 1 mRNA, 
complete cds 


7e-04 


<NONE> 


<NONE> 


<NONE> 


993 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


7e-04 


<NONE> 


<NONE> 


<NONE> 


994 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


7e-04 


3327230 


(AB014608) KIAA0708 protein
[Homo sapiens] 


9.5 
















995 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds


7e-04 


3327230 


(AB014608) KIAA0708 protein
[Homo sapiens] 


9.3 


996 


AF074387 


Sambucus nigra 
hevein-like protein
mRNA, complete cds


7e-04 


3876455 


(Z93380) predicted using 
Genefinder; similar to 7tm 
receptor protein [Caenorhabditis 
elegans] 


7.1 


997 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds


7e-04 


2128771 


hypothetical protein MJ1293 - 
Methanococcus jannaschii 
>gi|1591931 (U67570) M.
jannaschii predicted coding 
region MJ1293 [Methanococcus 
jannaschii] 


6.2 


998 


U09412 


Human zinc finger 
protein ZNF134
mRNA, complete cds 


7e-04 


1083336 


glutathione transferase (EC 
2.5.1.18) pi A - mouse 


5.4 


999 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


7e-04 


473515 


(M17619) NADH 
dehydrogenase subunit ND4 
[Asterina pectinifera] 


3.7 


1000 


AF012899


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


7e-04 


1724097 


(U79772) female sex protein 
[Mercurialis annua] 


3.3 


1001 


AF100694 


Mus musculus
Pontin52 mRNA, 
complete cds 


7e-04


U97103 


(D49747) core, env, and part of 
E2/NS1 


3.2 


1002 


X16995


Mouse N10 gene for 
a nuclear hormonal 
binding receptor 


7e-04 


345372 


unc-5 protein, long form -
Caenorhabditis elegans
>gi|258529|bbs|118648
(S47168) UNC-
5=immunoglobulin and
thrombospondin type 1
transmembrane protein
{alternatively spliced} aa]
[Caenorhabditis elegans]
>gi|2662596 (AF03669S) C. 
elegans UNC-5 (NID:g25852) 


2.7 
















1003 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


7e-04 


4204220 


(AB022866) mobilization 
protein 


2.5 


1004 


AF093268 


Rattus norvegicus 
homer-1c mRNA,
complete cds 


7e-04 


3201550 


(Y17116) fibrinogen-binding
protein 


2.4 


1005 


AF074386 


Sambucus nigra 
hevein-like protein
mRNA, complete cds


7e-04 


1174264


(U45966) polyprotein [Hepatitis 
G virus] 


0.73 


1006 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


7e-04 


135308 


TRANSCRIPTION FACTOR 
JUN-D 


0.065 


1007 


X98745 


H.sapiens EWS gene, 
intron 6, 
polymorphism 


7e-04 


728836 


!!!! ALU SUBFAMILY SP 
WARNING ENTRY 


0.001 


1008 


AJ005813 


Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 
enzyme 


7e-04 


1633564 


(U47924) C8 [Homo sapiens] 


9e-09 


1009 


AF074386 


Sambucus nigra 
hevein-like protein
mRNA, complete cds 


6e-04 


284171 


Ig epsilon chain C region form 3 
- human 


1.3 


1010


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


6e-04 


3845262 


(AE001414) BRAHMA
homolog (DNA/RNA helicase
superfamily II)


0.25 


1011 


AL034404 


Human DNA
sequence from clone
417C12 on
chromosome Xp22.11-
22.2, complete
sequence [Homo 
sapiens] 


3e-04 


<NONE>


<NONE> 


<NONE>


1012 


M99701 


Homo sapiens (pp21) 
mRNA, complete cds. 


3e-04 


<NONE> 


<NONE> 


<NONE> 














1013


U00227 


Ovis aries Merino
breed DR beta-chain
antigen binding
domain, MHC class II
DRB (Ovar-DRB24) 
gene, partial cds. 


3e-04 


<NONE> 


<NONE> 


<NONE> 


1014 


AF074387 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


3e-04 


<NONE> 


<NONE> 


<NONE> 


1015 


U95102 


Xenopus laevis
mitotic
phosphoprotein 90
mRNA, complete cds 


3e-04 


999418 


(L19655) ORF [Tomato
ringspot virus] 


8.3 


1016 


AB012106 


Brassica rapa mRNA
for SRK45, complete 
cds 


3e-04 


2367460 


(AF011415) putative
pheromone receptor [Mus 
musculus] 


7.0 


1017 


AJ010737


Mus musculus DNA 
for microsatellite 3kb 
upstream I bp gene 


3e-04 


4106549 


(AF104411) neuronal-specific 
septin 3 [Mus musculus] 


5.5 


1018 


AF053137 


Homo sapiens histone
deacetylase 3 gene, 
exons 4, 5, 6, 7, 8, 9, 
and 10 


3e-04 


416702 


NADH-DEPENDENT FLAVIN 
OXIDOREDUCTASE acid- 
inducible - Eubacterium sp 
>gi|1381570 (U574S9)
NADH:flavin oxidoreductase
[Eubacterium sp. VPI 12708]


5.3 


1019 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


3e-04 


1785789 


(Y08502) orf111d [Arabidopsis
thaliana] 


5.1 


1020 


AC004173 


Homo sapiens clone 
UWGC:y23x01I 
from 6p21, complete 
sequence [Homo 
sapiens] 


3e-04 


558521 


(D28917) polyprotein [Hepatitis 
C virus] 


1.1 


1021 


X57025 


Human IGF-I mRNA 
for insulin-like 
growth factor I


3e-04 


4206707 


(AF118122) putative outer
membrane protein OmpU 


0.65 


1022 


X77090 


H.sapiens IL-1Ra
gene.


3e-04 


1065941 


(U40799) F42C5.7 gene product 
[Caenorhabditis elegans]


0.12 






1023


M34651


Pseudorabies virus
with upstream and
downstream
sequences.


3e-04 


2746853 


(AF040650) contains similarity 
to sodium-potassium-chloride 
cotransport proteins 


7e-05 


1024 


Z36011 


S.cerevisiae 
chromosome II 
reading frame ORF 
YBR142w 


3e-04 


2500537 


PROBABLE ATP-
DEPENDENT RNA
HELICASE HAS1
>gi|626265|pir||S47451 
hypothetical protein YMR290c 
RNA helicase [Saccharomyces 
cerevisiae] 


4e-08


1025 


AF020286 


Dictyostelium 
discoideum 2034 
gene, partial cds 


3e-04


1465834 


(U64857) No definition line 
found [Caenorhabditis elegans] 


6e-14


1026


L26049 


Chlamydomonas 
reinhardtii dynein 
heavy chain alpha 
(ODA11) gene, exons 
2-15, and partial cds. 


3e-04 


3876775 


(Z81077) predicted using 
Genefinder; Similarity to Yeast 
protein 8248 (TR:G587531) 


9e-15 


1027 


AF020286


Dictyostelium 
discoideum 2034 
gene, partial cds 


3e-04 


1465834 


(U64857) No definition line 
found [Caenorhabditis elegans]


1e-17


1028 


X79811


S.cerevisiae ACT3 
gene 


3e-04 


3876090 


UbVbjb) Similarity to Yeast 
uridine kinase 

(SW:URK1_YEAST); cDNA
EST EMBL:Z14695 comes 
from this gene; cDNA EST 
CEMSE17F comes from this 
gene; cDNA EST 
EMBL:D67355 comes from this 
gene; cDNA EST yk209hl.5 
comes from this ge... 


7e-31


1029 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


2e-04 


<NONE> 


<NONE> 


<NONE> 


1030 


M22970 


Human pancreatic 
phospholipase A-2 
(PLA-2) gene, exons 
1 to 3. 


2e-04 


<NONE> 


<NONE> 


<NONE> 






1031


Z68686


Human DNA
sequence from
cosmid N2E9 on
chromosome 22.
Contains EST,
complete sequence
[Homo sapiens]


2e-04 


<NONE> 


<NONE> 


<NONE> 


1032 


X95154 


H.sapiens brca2 gene 
exon 4 > :: 

emb|A62779|A62779 
Sequence 20 from 
Patent WO9719110 


2e-04


<NONE>


<NONE> 


<NONE> 


1033 


AJ005813 


Arabidopsis thaliana 
mRNA for
neoxanthin cleavage
enzyme


2e-04 


<NONE> 


<NONE> 


<NONE> 


1034 


AF100694 


Mus musculus
Pontin52 mRNA, 
complete cds 


2e-04 


<NONE> 


<NONE> 


<NONE> 


1035 


AE001415 


Plasmodium 
falciparum 
chromosome 2, 
section 52 of 73 of 
the complete 
sequence 


2e-04 


<NONE> 


<NONE> 


<NONE> 


1036 


AF090115 


Lycopersicon 
esculentum cytosolic 
class II small heat 
shock protein HCT2 
(HSP17.4) mRNA,
complete cds 


2e-04 


<NONE> 


<NONE> 


<NONE> 


1037 


AC000958 


Homo sapiens 
(subclone 6_d9 from 
P1H21) DNA
sequence 


2e-04 


<NONE> 


<NONE> 


<NONE> 


1038 


AF093268 


Rattus norvegicus 
homer-1c mRNA,
complete cds 


2e-04 


2501523 


CD59 GLYCOPROTEIN 
PRECURSOR 


7.1 


1039 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


2e-04 


2765360 


(Y13925) cathepsin L2 [Penaeus
vannamei]


6.S 












1040


AF027174


Arabidopsis thaliana
cellulose synthase
catalytic subunit (Ath-
B) mRNA, complete
cds


2e-04


133636


RNA POLYMERASE
>gi|67126|pir||RRXPLC RNA-
directed RNA polymerase (EC
2.7.7.48) - lymphocytic
choriomeningitis virus (strain
Armstrong 53b) >gi|331369


5.2 


1041 


AB012106


Brassica rapa mRNA 
for SRK45, complete 
cds 


2e-04 


3822155 


(AF074613) type II secretion 
protein [Escherichia coli 
O157:H7]


4.0 


1042 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


2e-04 


1718125 


REGULATORY PROTEIN E2 
>gi|1020222 type 36]


0.38 


1043 


X17058


Sus scrofa mRNA for
glucose transporter
protein 


2e-04 


3341906 


(AB009593) xylose transporter 


2e-15 


1044 


AF008216 


Homo sapiens 
candidate tumor 
suppressor pp32rl 


1e-04


<NONE> 


<NONE> 


<NONE> 


1045 


X98890 


S.tuberosum mRNA 
for inorganic 
phosphate 
transporter, StPT1


1e-04


624126 


(U42580) a65L [Paramecium 
bursaria Chlorella virus 1]


7.9 


1046 


L14930


Glycine max (Rab7p)
mRNA, complete cds.


9e-05 


<NONE> 


<NONE> 


<NONE> 


1047 


AJ009970 


Mus musculus 
thromboxane A2 
receptor gene, exon 3, 
partial 


9e-05 


<NONE> 


<NONE> 


<NONE> 


1048 


Y11896 


M.musculus mRNA 
for Brx gene, partial 


9e-05 


<NONE> 


<NONE> 


<NONE> 


1049 


L10832 


Polistes annularis 
(clone pan48AAT) 
tandem repeat region. 


9e-05 


<NONE> 


<NONE> 


<NONE> 


1050 


AF055011 


Homo sapiens clone 
24587 mRNA 
sequence 


9e-05 


3S805S6 


(Z/y/MSjcDNALM 
EMBL:D28009 comes from this 
gene; cDNA EST 
EMBL:D28008 comes from this 
gene; cDNA EST 
EMBL:D32478 comes from this 
gene; cDNA EST 
EMBL:D34508 comes from this 
gene; cDNA EST 
EMBL:D37581 comes from this 
gene; ... 


7.6 
















1051 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


9e-05 


3024292 


RHODOPSIN>gi|2290717 
(AF000947) rhodopsin [Sepia 
officinalis] 


6.7 


1052 


Z58294 


H. sapiens CpG DNA, 
clone 34d6, forward 
read cpg34d6.ft1a.


9e-05 


3885496 


(AF064825) heparin/heparan 
sulfate N-acetylglucosaminyl N- 
deacetylase/N-sulfotransferase 
[Bos taurus] 


0.65 


1053 


D87451 


Human mRNA for 
KIAA0262 gene, 
complete cds 


9e-05


3874739


(Z66495) similar to claustrin
like 


0.004 


1054 


L37092 


Mus musculus cyclin- 
dependent kinase 
homologue 


9e-05 


3080513 


(AL022598) hypothetical 
protein 


4e-09 


1055 


AF074386 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


8e-05 


<NONE> 


<NONE> 


<NONE> 


1056 


AF027174 


Arabidopsis thaliana
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


8e-05 


<NONE> 


<NONE> 


<NONE> 


1057 


AF074386 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


8e-05 


<NONE> 


<NONE> 


<NONE> 


1058 


D10102 


Homo sapiens DNA 
from cosmid 
clone: 844, GT repeat 
sequence 


8e-05 


<NONE> 


<NONE> 


<NONE> 


1059 


U72396 


Lycopersicon 
esculentum class II 
small heat shock 
protein Le-HSP17.6 
mRNA, complete cds


8e-05 


1176475 


HYPOTHETICAL 80.4 KD
PROTEIN IN SMC3-MRPL8
INTERGENIC REGION
>gi|1078237|pir||S56849
probable membrane protein
YJL073w - yeast
(Saccharomyces cerevisiae)
>gi|895898 (X88851)
hypothetical protein YJL073w 
[Saccharomyces cerevisiae] 


6.0 


1060 


X71934 


H. sapiens XB gene
for tenascin-X, repeat 
XIII 


8e-05 


285207 


microtubule-associated protein, 
110K tau - rat >gi|207158
(M84156) big tau [Rattus
norvegicus]


3.7 


1061 


AF027174 


Arabidopsis thaliana
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


8e-05 


4049682 


(AF063866) ORF MSV092 
hypothetical protein 
[Melanoplus sanguinipes
entomopox virus] 


2.1 


1062 


AF090115 


Lycopersicon
esculentum cytosolic
class II small heat
shock protein HCT2
(HSP17.4) mRNA,
complete cds


8e-05 


3861019 


/ L) unknown
[Rickettsia prowazekii] 


5e-14 


1063 


AF027174 


Arabidopsis thaliana 
cellulose synthase
catalytic subunit (Ath-
B) mRNA, complete
cds


7e-05 


<NONE> 


<NONE> 


<NONE> 


1064 


L04193 


Human lens 
membrane protein 
(mp19) gene, exon
11. 


7e-05 


<NONE> 


<NONE> 


<NONE> 


1065 


X61609 


B.napus gene for 
LHC II Type III 
chlorophyll a/b 
binding protein 


7e-05 


2132314 


hypothetical protein YPR174c - 
yeast similarity to a nuclear 
lamin from C. elegans (PIR 
accession numocr j / j 
[Saccharomyces cerevisiae] 


8.9 


1066 


AF064029 


Helianthus tuberosus 
lectin 1 mRNA, 
complete cds 


7e-05 


2979422 


(AB006757) PCDH7 (BH- 
Pcdh)c [Homo sapiens] 


5.7 


1067 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


7e-05 


2493696 


HYPOTHETICAL 21.5 KD 
PROTEIN (ORF 185) 
>gi|1480440 (U34204)
ORF185; hypothetical 2L4 kD 
protein [Brassica oleracea] 


5.2 


1068 


AF093268 


Rattus norvegicus 
homer-1c mRNA,
complete cds 


7e-05 


2501029 


PROBABLE LEUCYL-TRNA 
SYNTHETASE, 
MITOCHONDRIAL 
PRECURSOR (LEUCINE--
TRNA LIGASE) (LEURS)
KIAA0028 [Homo sapiens] 


1.4 


1069


Z68758


Human DNA
sequence from
cosmid cN85E10 on
chromosome 22q11.2-
qter


3e-05 


<NONE> 


<NONE> 


<NONE> 


1070 


X60653 


human Histone H3.3 
pseudogene (CIR- 
456) 


3e-05 


<NONE> 


<NONE> 


<NONE> 


1071 


Z58294 


H.sapiens CpG DNA, 
clone 34d6, forward 
read cpg34d6.ft1a


3e-05 


1706241


GUANYLYL CYCLASE GC-E 
PRECURSOR cyclase receptor 
[Mus musculus] 


9.6 


1072 


AF043251 


Homo sapiens 
mitochondrial outer 
membrane protein 
(Tom40) gene, 
nuclear gene 
encoding 
mitochondrial 
protein, exons 1 
through 6 


3e-05 


113980 


AMINE OXIDASE [FLAVIN- 
CONTAINING] B oxidase 
(flavin-containing) (EC 1.4.3.4) 
B - human B [human, platelet, 
Peptide Partial, 520 aa] [Homo 
sapiens] 


8.9 


1073 


M31104


Chicken progesterone 
receptor gene, 
encoding forms A and 
B, exons 1 and 2.


3e-05 


1170841 


IG GAMMA LAMBDA 
CHAIN V-II REGION 


4.8 


1074 


AF012899


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


3e-05 


543684 


ribosomal protein S3 - 
Chlamydomonas humicola 
chloroplast (fragment) 


4.2 


1075 


L22206 


Human vasopressin 
receptor V2 gene,
complete cds. 


3e-05 


791207 


(U20615) Gnot1 homeodomain
protein [Gallus gallus]


1.8 


1076 


AF093268 


Rattus norvegicus 
homer-1c mRNA,
complete cds 


3e-05 


3237340 


(AF033361) polyprotein 
[Hepatitis C virus] 


0.94 


1077 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


3e-05 


2879805 


(AL021813) hypothetical 
protein 


0.001 


1078 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


3e-05 


3877951 


(Z81555) predicted using
Genefinder 


3e-07 


1079 


AF090115 


Lycopersicon 
esculentum cytosolic 
class II small heat 
shock protein HCT2 
(HSP17.4) mRNA, 
complete cds 


2e-05 


<NONE> 


<NONE> 


<NONE> 


1080 


AF064029 


Helianthus tuberosus 
lectin 1 mRNA, 
complete cds 


2e-05 


3880197 


(Z8U32) predicted using 
Genefinder 


2.4 


1081 


AF087989 


Homo sapiens full 
length insert cDNA 
clone YX29D10 


2e-05 


113667 


!!!! ALU CLASS B WARNING 
ENTRY!!!! 


1.8


1082 


AF064029 


Helianthus tuberosus 
lectin 1 mRNA, 
complete cds 


2e-05 


474896 


(L31967) mating type protein 
[Coprinus cinereus] 


1.4 


1083 


AF064029 


Helianthus tuberosus 
lectin 1 mRNA, 
complete cds 


2e-05 


2266988 


(Y13274) M33 polycomb-like
protein [Mus musculus] 


0.62 


1084 


U67415 


Equus caballus UCD- 
E-CA-467 
dinucleotide repeat 
region, complete 
sequence 


1e-05


<NONE> 


<NONE> 


<NONE> 


1085 


X67277 


H.sapiens BGP gene 
for biliary 
glycoprotein, 
promoter region and 
exon 1 


1e-05


<NONE> 


<NONE> 


<NONE> 


1086 


X85117 


H.sapiens epb72 gene 
exons 2,3,4,5,6,7


1e-05


<NONE> 


<NONE> 


<NONE> 


1087 


U88328 


Mus musculus 
suppressor of 
cytokine signalling-3


1e-05


443877 


(70Q& e \l\ rnre region* 

pid:g443877 [Hepatitis C virus] 
virus] 


3.9 


1088 


Y12853


Homo sapiens P2X7 
gene, exon 4-8 


1e-05


3878726 


(Z66498) similar to cuticle 
collagen; cDNA EST 
EMBL:D75584 comes from this 
gene


0.36 


1089 


AE001140


Borrelia burgdorferi 
(section 26 of 70) of 
the complete genome 


1e-05


3860719 


(AJ235270) GLUTAMYL- 
tRNA AMIDOTRANSFERASE 
SUBUNIT A (gatA) [Rickettsia 
prowazekii]


4e-15


1090 


AJ224U2 


Homo sapiens gamma 
adaptin gene, exon 2 
and flanking intronic
sequences 


9e-06 


<NONE> 


<NONE> 


<NONE>


1091 


AB000565 


Homo sapiens DNA 
for repeat sequence 
Alu


9e-06 


72879 


translation initiation factor IF-2 - 
Escherichia coli 


5.1 


1092 


Z78985 


H.sapiens flow-sorted 
chromosome 6 
HindIII fragment,
SC6pA20B4 


9e-06 


159975


(M65164) 5IC surface protein 
[Paramecium tetraurelia] 


4.8 


1093 


Z21677


Thermotoga maritima 
DNA for spc operon 


9e-06 


585879 


50S RIBOSOMAL PROTEIN 
L2 maritima >gi|437926 
(Z21677) ribosomal protein L2 


7e-14 


1094 


AF031494 


Drosophila hydei 
Dhc7 (Threads) 
mRNA, complete cds 


9e-06 


729377 


DYNEIN BETA CHAIN, 
CILIARY sea urchin 
(Anthocidaris crassispina) chain 
[Anthocidaris crassispina] 


4e-18 


1095 


AF051315 


Homo sapiens 
placental protein 
17al (PP17) mRNA, 
complete cds 


4e-06 


<NONE> 


<NONE> 




1096 


AC001460 


Homo sapiens 
(subclone 2_f4 from 
BACH 107) DNA 
sequence 


4e-06 


2648304 


(AE000952) ISA1214-6,
putative transposase 


6.2 


1097 


X85030 


H.sapiens mRNA for 
skeletal muscle- 
specific calpain 


4e-06 


4239857 


(AB016726) calpain 
[Schistosoma japonicum] 


0.006 


1098 


M75162 


Human polymorphic 
arylamine N- 
acetyltransferase 


3e-06 


<NONE> 


<NONE> 


<NONE> 


1099 


AB009999 


Rattus norvegicus 
mRNA for CDP- 
diacylglycerol 
synthase, complete 
cds 


3e-06 


3879045 


(Z70309)RI02.6 
[Caenorhabditis elegans] 


7.3 


1100


Z78985


H.sapiens flow-sorted 
chromosome 6 
HindIII fragment,
SC6pA20B4 


3e-06 


266529 


MERCURIC REDUCTASE 
(HG(II) REDUCTASE) 
>gi|418744|pir||S3016S 
mercury (II) reductase 


6.5 


1101 


AB012190 


Homo sapiens mRNA 
for Nedd8-activating
enzyme hUba3, 
complete cds 


3e-06 


3877938 


(Z79697) F58H10.1 
[Caenorhabditis elegans] 


6.3 


1102


AF041056


Homo sapiens
WSCR4 gene, exons
3 and 4


3e-06 


1568583 


(Z80775) hypothetical protein 
Rv0044c 


t Q 


1103


X00777


Mouse E(d) beta gene 
5' flanking region and 
exon 1 


3e-06 


1680722 


(U72497) fatty acid amide
hydrolase [Rattus norvegicus] 


0.008 


1104 


D21205 


Human mRNA for
estrogen responsive 
finger protein, 
complete cds 


3e-06 


563127 


(U09825) acid finger protein 
[Homo sapiens] 


1e-05


1105 


Z47046 


Human cosmid 
QLL2C9 from Xq28 


1e-06


<NONE>


<NONE>


<NONE>


1106 


L26261 


Human MHC class III 
HLA-RP1 gene. 


1e-06


<NONE> 


<NONE> 


<NONE> 


1107 


M13402


Rat 5S RNA gene, 
clone 5S-2. 


1e-06


<NONE> 


<NONE>


<NONE>


1108 


X68793 


H.sapiens gene for 
antithrombin III 


1e-06


<NONE> 


<NONE> 


<NONE> 


1109 


AF003540 


Homo sapiens 
Krueppel family zinc 
finger protein 


1e-06


2507553 


ZINC FINGER PROTEIN 33A 
(ZINC FINGER PROTEIN 
KOX31)(KIAA0065) 
(HA0946) Kruppel-related. 
[Homo sapiens] 


0.098 


1110


L42096 


Homo sapiens 
(subclone 10_d2 from
P1 H21) DNA
sequence. 


1e-06


1330401 


(U58762) T27F7.1 gene product 
[Caenorhabditis elegans] 


0.015 


1111 


Z69925 


Human DNA
sequence from
cosmid cN116A5,
between markers 

— O— Ov ill IU 

D22S86 on 
chromosome 22q12
contains EST 


9e-07 


<NONE> 


<NONE> 


<NONE> 


1112 


D90217 


S. cerevisiae gene for
YmL33,
mitochondrial 
ribosomal proteins of 
large subunit 


9e-07 


3879097 


(Z81109) predicted using 
Genefinder; similar to
sodium/phosphate transporter; 
cDNA EST yk326f6.3 comes 
from this gene; cDNA EST 
yk326f6.5 comes from this gene 
[Caenorhabditis elegans]


7.1 


1113


AF012899


Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds


9e-07


1330345


(Loans; coded for by C.
elegans cDNA yfcJ4bl .o; coded
for by C. elegans cDNA
yk13h10.5; coded for by C.
elegans cDNA yk46e8.5; coded 
for by C. elegans cDNA 
yk46d5.5; coded for by C. 
elegans cDNA yk43c2.5; coded 
for by C. elegans cDNA 
yk46e8.... 


2e-29 


1114 


AF086562 


Homo sapiens full 
length insert cDNA 
clone ZE16C03 


4e-07 


1072210 


(U40945) coded for by C. 
elegans cDNA yk74b9.3; coded 
for by C. elegans cDNA 
yk74b9.5; similar to repeat of 
calcium channel alpha subunits;
similar to tetracycline resistance 
protein; similar to hypothetical 
protein in HSP30-PMP1 region 
(SP... 


3.9 


1115 


L39062 


Homo sapiens 
interleukin 9 receptor 
IL9R pseudogene, 
exons 1-9 


4e-07 


3879983 


(Z46/9i>) similar to 
transforming protein etc2; 
cDNA EST EMBL:D34137 
comes from this gene; cDNA 
EST EMBL:D37172 comes 
from this gene; cDNA EST 
EMBL:D76266 comes from this 
gene; cDNA EST 
EMBL:D70493 comes from this 
gene; cDNA ... 


3.3 


1116 


Z69364 


Human DNA 
sequence from 
cosmid L96F8, 
Huntington's Disease
Region, chromosome
4p16.3 contains EST
and cDNA. > ::
emb|Z69365|HSL96F
8A Human DNA
sequence from
cosmid L96F8,
Huntington's Disease
Region, chromosome
4p16.3 contains EST
and cDNA. 


4e-07 


3493176 


(AF022889) latent TGF beta
binding protein [Mus musculus] 


3.0 


1117


D79986


Human mRNA for
KIAA0164 gene,
complete cds


4e-07 


4038031 


(AC005936) hypothetical 
protein [Arabidopsis thaliana] 


0.30 


1118


D43950 


Human mRNA for 
KIAA0098 gene, 
partial cds 


3e-07 


<NONE> 


<NONE> 


<NONE> 


1119 


AF037168 


Arabidopsis thaliana 
DnaJ homologue 
(AtJ6) mRNA,
complete cds 


3e-07 


3881075 


(Ai-u,u<o /) predicted using 
Genefinder; similar to DnaJ 
domain ; Thioredoxin; cDNA 
EST yk433f3.5 comes from this 
gene; cDNA EST 
EMBL:D32359 comes from this 
gene; cDNA EST 
EMBL:D34721 comes from this 
gene; cDNA EST yk433f3.3 c... 


3e-09 


1120 


X69838 


H.sapiens mRNA for 
G9a 


3e-07 


3873414 


(U00043) similar to D. 
melanogaster trithorax protein 


3e-29 


1121 


AB011124 


Homo sapiens mRNA 
for KIAA0552 
protein, complete cds 


2e-07 


2618749 


(U90880) hypothetical protein 
2; predicted using XGrail 


2.0 


1122 


K03012 


Human cellular fms 
proto-oncogene, 
partial cds. 


1e-07


<NONE> 


<NONE> 


<NONE> 


1123 


AB016195 


Homo sapiens DNA,
microsatellite and Alu 
repeat region 


1e-07


728837 


!!!! ALU SUBFAMILY SQ 
WARNING ENTRY 


0.095 


1124 


Y16795


Homo sapiens 
psihHaA pseudogene 


4e-08 


<NONE> 


<NONE> 


<NONE> 


1125 


AB012624


Homo sapiens FLI1
gene for ERGB 
transcription factor,
intron 4 and partial 
cds 


4e-08 


728836 


!!!! ALU SUBFAMILY SP 
WARNING ENTRY 


3.6 


1126 


AJ131341 


Homo sapiens ogg1
gene, exons 1-7 


4e-08 


113668 


!!!! ALU CLASS C WARNING 
ENTRY !!!! 


3e-05 


1127 


L81902 


Homo sapiens 
(subclone 1_c10 from
P1 H69) DNA
sequence 


3e-08


4225950 


(AJ132701) centaurin gamma IB 


1.8 


1128


Y17968


Gallus gallus mRNA
for high mobility 
group 1 protein 


3e-08


3041855 


(AC004537) similar to tumor 
suppressor p33ING1; similar to
AF044076 (PID:g2829208)
[Homo sapiens]


3e-31 


1129 


Y13901 


Homo sapiens FGFR- 
4 gene 


1e-08


<NONE> 


<NONE> 


<NONE> 



1130 


L22024 


Mesocricetus auratus 
serum amyloid P 
component gene, 
complete cds. 


1e-08


<NONE> 


<NONE>


<NONE> 


1131 


AF012899


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds


1e-08


<NONE> 


<NONE> 


<NONE> 


1132 


X14034


Human mRNA for 
phospholipase C > :: 
gb|M37238|HUMPL 
C Human 
phospholipase C 
mRNA, complete cds. 


1e-08


<NONE> 


<NONE>


<NONE> 


1133 


Z59381 


H.sapiens CpG DNA, 
clone 152b10,
forward read
cpg152b10.ft1a.


1e-08


<NONE> 


<NONE> 


<NONE> 


1134 


L81839 


Homo sapiens 
(subclone 2Ji3 from 
PI H43) DNA 
sequence 


1e-08


<NONE> 


<NONE> 


<NONE> 


1135 


X14448 


Human GLA gene for 
alpha-D-galactosidase 
A (EC 3.2.1.22) 


1e-08


3334427 


HYPOTHETICAL PROTEIN
MJ1207 Methanococcus
jannaschii >gi|1591837
(U67562) protease synthase and
sporulation negative regulator
Pai1, putative [Methanococcus
jannaschii] 


9.1 


1136 


AL023774


Human DNA
sequence from clone
799F15 on
chromosome Xq25,
complete sequence
[Homo sapiens]


1e-08


1354935


(U58330) probable copper-
transporting atpase


1.2 


1137 


X64639


H.sapiens DNA
repetitive
subtelomeric-like
sequence (522 bp)


1e-08


77356


hypothetical 70K protein -
eggplant mosaic virus


0.098 


1138 


U97058


Human HuD gene,
5'UTR


5e-09


3387886


(AF070530) unknown [Homo
sapiens]


9.5 


1139


Z82181


Human DNA
sequence from
cosmid E86D10 on
chromosome 22.
contains ESTs,
exontrap, complete
sequence


5e-09 


728831 


!!!! ALU SUBFAMILY J 
WARNING ENTRY 


8.4 


1140 


AJ006587 


Mus musculus mRNA 
for translation 
initiation factor eIF2 
gamma X 


5e-09




(U22376) alternatively spliced 


0.04


1141 


Y11108


H.sapiens WNT8B 
gene 


4e-09 


2854198 


(AF045646) contains similarity 
to collagens 


4.0 


1142 


AE001223 


Treponema pallidum 
section 39 of 87 of 
the complete genome 


4e-09 


3334189 


CELL DIVISION PROTEIN 
FTSY HOMOLOG 


1.5 


1143 


Z47046 


Human cosmid 
QLL2C9 from Xq28 


4e-09 


104045 


fibroblast growth factor receptor 
Al precursor - African clawed 
frog >gi|214894 (M55163)
fibroblast growth factor receptor 
[Xenopus laevis] 


1.3 


1144 


AG000746 


Homo sapiens 
genomic DNA, 21q 
region, clone: 
T171Bm40 


4e-09 


113666 


!!!! ALU CLASS A WARNING 
ENTRY !!!! 


0.33 


1145 


M74002


Human arginine-rich
nuclear protein
mRNA, complete cds.


4e-09 


3875371 


^z-jur'+ij; Lurrmiii - j vjimc anu — 
arginine rich domain, possesses
weak similarity with the RNA
binding domains from RNA
splicing factor U2AF 65 KD
subunit; cDNA EST
EMBL:D64658 comes from this
gene; cDNA EST
EMBL:D66829 comes f...
>gi|3878699|gnl|PID|e1351700
possesses weak similarity with
the RNA binding domains from
RNA splicing factor U2AF 65
KD subunit; cDNA EST
EMBL:D64658 comes from this
gene; cDNA EST
EMBL:D66829 comes f...


3e-06 


1146 


U95094


Xenopus laevis XL-
INCENP (XL-
INCENP) mRNA,
complete cds


2e-09


2494337


ENDO-1,4-BETA-XYLANASE
PRECURSOR sp.]


4.9 


1147


U20554


Drosophila
melanogaster UDP-
glucose:glycoprotein
glucosyltransferase
mRNA, complete cds.


2e-09


2499087


UDP-
GLUCOSE:GLYCOPROTEIN
GLUCOSYLTRANSFERASE
PRECURSOR (DUGT) 
glucosyl transferase - fruit fly 
(Drosophila sp.) 
glucosyltransferase precursor 
[Drosophila melanogaster] 


4e-24 


1148 


Z56162 


H. sapiens CpG DNA, 
clone 91c9, forward 
read cpg91c9.ft1a


1e-09


<NONE>


<NONE> 


<NONE> 


1149


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-09


1002424 


(U25739) YSPL-1 form 1 [Mus 
musculus]


8.9 


1150 


M85276 


Homo sapiens NKG5 
gene, complete cds. 


1e-09


2315436 


(AF016447) No definition line 
found [Caenorhabditis elegans] 


8.3 


1151 


M94065 


Human
dihydroorotate
dehydrogenase 
mRNA, 3' end. 


1e-09


3892656 


(AB014464) MGC-24V [Mus 
musculus] 


6.2 


1152 


AJ131895 


Homo sapiens 
genomic CAG repeat 
element, clone 
60o2(250) 


5e-10 


<NONE> 


<NONE> 


<NONE> 


1153 


Z82181


Human DNA
sequence from
cosmid E86D10 on
chromosome 22. 
contains ESTs, 
exontrap, complete 
sequence 


5e-10 


728831 


!!!! ALU SUBFAMILY J 
WARNING ENTRY 


7.9 


1154 


AJ224442 


Homo sapiens mRNA 
for putative 
methyltransferase 


5e-10


113667 


!!!! ALU CLASS B WARNING
ENTRY !!!! 


0.15 


1155 


AJ010230


Homo sapiens RET
finger protein-like 1
antisense transcript, 
partial 


5e-10 


728834 


!!!! ALU SUBFAMILY SB2
WARNING ENTRY 


0.006 


1156 


AF111116


Homo sapiens 
silencer of death 
domains (SODD) 
mRNA, complete cds


5e-10 


4160014 


(AF111116) silencer of death
domains [Homo sapiens] 


2e-08 


1157 


Z97017 


Homo sapiens mRNA 
for hypothetical 
protein 


4e-10


<NONE> 


<NONE> 


<NONE> 


1158


AF001298


Homo sapiens type II
integral membrane
protein


4e-10 


<NONE> 


<NONE> 


<NONE> 


1159 


Y11395 


H.sapiens mRNA for 
p40 


2e-10 


1000340 


(U34384) CheW [Borrelia
burgdorferi]


2.4 


1160 


U41096 


Human non-coding 
sequence upstream 
from DOC-2 gene on
chromosome 5 


2e-10 


728837 


!!!! ALU SUBFAMILY SQ 
WARNING ENTRY 


0.28 


1161 


AF012899 


Sambucus nigra 
ribosome inactivating
protein precursor 
mRNA, complete cds 


6e-11


<NONE> 


<NONE> 


<NONE> 


1162 


Z36111 


S.cerevisiae
chromosome II 
reading frame ORF 
YBR242w 


6e-11


2213560 


(Z97052) hypothetical protein 


3e-27 


1163 


D89174 


Schizosaccharomyces 
pombe mRNA, partial 
cds, clone: SY 1004 


6e-11


3879758 


Similarity to yeast 
protein TREMBL ED E246895); 
cDNA EST EMBL:T00018 
comes from this gene; cDNA 
EST EMBL:C13908 comes
from this gene; cDNA EST
EMBL:C11656 comes from this
gene; cDNA EST yk234a5.3


*re-JU 


1164 


Z95437


Human DNA
sequence from
cosmid A1 on
chromosome 6 
contains ESTs. 
HERV like retroviral 
sequence 


5e-11


<NONE> 


<NONE> 


<NONE> 


1165 


AF012899


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


5e-11


3886065 


(AF106581) contains similarity
to C4-type zinc fingers


4.9 


1166 


X56997


Human UbA52 gene
coding for ubiquitin-
52 amino acid fusion
protein


2e-11


<NONE> 


<NONE> 


<NONE> 


1167 


AF086253


Homo sapiens full
length insert cDNA
clone ZD40G12


2e-11


2134780


apoptosis inhibitor IAP homolog
human


3.8 



1168 


AB018314 


Homo sapiens mRNA 
for KIAA0771 
protein, partial cds 


2e-11


3024343 


P53-BINDING PROTEIN
53BP2 Bbp/53BP2 [Homo 
sapiens] 


2e-11


1169 


Z74972 


S.cerevisiae 
chromosome XV 
reading frame ORF 
YOR064c 


2e-11


3041855 


(AC004537) similar to tumor 
suppressor p33ING1; similar to
AF044076 (PID:g2829208) 
[Homo sapiens] 


2e-40 




1170


Z82181


Human DNA
sequence from
cosmid E86D10 on
chromosome 22. 
contains ESTs, 
exontrap, complete 
sequence 


7e-12


<NONE> 


<NONE> 


<NONE> 


1171 


X77738 


H.sapiens red cell 
anion exchanger 
(EPB3, AE1, Band 3)
gene, 3' region 


7e-12 


2135416 


hypothetical protein - human 
>gi|288145


0.012 


1172 


S61977 


medium-chain acyl- 
CoA dehydrogenase 
{exon 10, intron 10} 
[human, Genomic,
1407 nt]


6e-12


113666 


!!!! ALU CLASS A WARNING
ENTRY !!!!


0.100


1173 


X66285 


M.musculus DNA for 
HCl locus 


6e-12


854065 


(X83413) U88 [Human
herpesvirus 6]


2e-06 


1174 


S78744 


protein S=activated 
protein C cofactor 
[rats, liver, mRNA, 
3315 nt]


6e-12 


2338292 


(AF009243) proline-rich Gla 
protein 2 [Homo sapiens] 


3e-10


1175 


X58474 


Bovine OXT gene for 
oxytocin, 5'
noncoding region 


2e-12 


1296429 


(L77967) small proline-rich 
protein with paired repeat 


4.1 


1176 


Z56314 


H.sapiens CpG DNA, 
clone 10h10, reverse
read cpg10h10.rt1a


2e-12


2935221 


(AF030154) pVII [bovine 
adenovirus type 3]


2.8 


1177 


Z56314 


H.sapiens CpG DNA, 
clone 10h10, reverse
read cpg10h10.rt1a


2e-12 


2708659 


(AF037440) putative 26 kDa
protein [Edwardsiella ictaluri]


2.8 


1178 


Z19543


M.musculus h2- 
calponin cDNA 


2e-12 


2497945 


BETA SCRUIN >gi|1015535
(Z47541) beta scruin [Limulus
polyphemus]


2e-(U 



1179


S45332


erythropoietin
receptor [human,
placental, Genomic,
8647 nt]


7e-13


728835 


!!!! ALU SUBFAMILY SC 
WARNING ENTRY 


0.074 


1180 


AF012899


Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds


2e-13 


<NONE> 


<NONE> 


<NONE> 


1181 


AF012899


Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds


2e-13


<NONE> 


<NONE> 


<NONE> 


1182 


Z59509 


H.sapiens CpG DNA, 
clone 15a1, reverse
read cpg15a1.rt1a


2e-13 


3150251 


(AL023634) hypothetical 
protein 


0.66 


1183 


D10170


Human CYP11B2 
gene for steroid 18- 
hydroxylase 


2e-13 


728837 


!!!! ALU SUBFAMILY SQ 
WARNING ENTRY 


3e-05 


1184 


U65416 


Human MHC class I 
molecule (MICB) 
gene, complete cds 


2e-13 


126295 


LINE-1 REVERSE
TRANSCRIPTASE 
HOMOLOG 


6e-11


1185 


AJ006031 


Mus musculus
IHABP gene,
promoter


8e-14 


2132223 


hypothetical protein YPL186c -
yeast 


1.1 


1186 


U34976 


Human gamma-
sarcoglycan mRNA,
complete cds 


8e-14 


1054903 


(U34976) gamma-sarcoglycan 
[Homo sapiens] >gi|4239660
sapiens] 


0.034 


1187 


D30647 


Rat mRNA for very- 
long-chain Acyl-CoA 
dehydrogenase, 
complete cds 


8e-14 


3183512 


ACYL-COA
DEHYDROGENASE, VERY-
LONG-CHAIN SPECIFIC
(VLCAD) >gi|2388724
(AF017176) very-long-chain
acyl-CoA dehydrogenase [Mus 
musculus] 


8e-23 


1188 


Z63247


H.sapiens CpG DNA,
clone 7g4, forward
read cpg7g4.ft1a


6e-14


86285


histone H1.01 - chicken


6.8 


1189 


U27196 


Gallus gallus zinc
finger protein (Fzf-1)
mRNA, complete cds. 


3e-14 


2134436 


zinc finger protein - chicken
(fragment)


4e-10


1190 


M26219


African green
monkey origin of
replication


2e-14 


<NONE> 


<NONE> 


<NONE> 


1191


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


2e-14


4235641 


(AF119040) NL0D
[Lycopersicon esculentum] 


0.65 


1192 


AF012899


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


2e-14 


3043728 


(AB011174) KIAA0602 protein
[Homo sapiens] 


0.28 


1193 


AJ005866 


Homo sapiens mRNA 
for putative Sqv-7- 
like protein, partial 


2e-14 


4008517


(AJ005866) Sqv-7-like protein
[Homo sapiens] 


0.004 


1194 


U32709 


Haemophilus 
influenzae Rd section 
24 of 163 of the 
complete genome 


2e-14


3861056 


(AJ235272)
POLYRIBONUCLEOTIDE
NUCLEOTIDYLTRANSFERA 
SE (pnp) [Rickettsia 
prowazekii] 


6e-28 


1195 


AF073485 


Homo sapiens MHC 
class I-related protein 
MR1 precursor
(MR1) gene, partial 
cds 


8e-15 


728831 


!!!! ALU SUBFAMILY J 
WARNING ENTRY 


1.0 


1196 


AF052135 


Homo sapiens clone 
23625 mRNA 
sequence 


8e-15


4098124 


(U73522) AMSH [Homo 
sapiens] 


8e-14 


1197 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


3e-15


<NONE> 


<NONE> 


<NONE> 


1198 


AF012899


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


3e-15 


113671 


!!!! ALU CLASS F WARNING
ENTRY !!!! 


1.7 


1199 


Z75104


S.cerevisiae
chromosome XV
reading frame ORF
YOR196c


3e-15 


3873570 


(Z46381) similar to lipoic acid 
synthase; cDNA EST yk283b6.3 
comes from this gene; cDNA 
EST yk283b6.5 comes from this 
gene; cDNA EST yk472f5.3
comes from this gene; cDNA
EST yk472f5.5 comes from this
gene; cDNA EST yk476e7.3...


1e-15


1200


X70052


S.cerevisiae sof1
gene


3e-15


1125754


(U42833) coded for by C.
elegans cDNA cm16f6; coded
for by C. elegans cDNA
CEESU63F; similar to S.
cerevisiae SOF1 protein
(SP:P33750) [Caenorhabditis 
elegans] 


3e-29 


1201 


AF012899


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


2e-15 


<NONE>


<NONE> 


<NONE> 


1202 


M92295 


Gorilla gorilla gamma 
1 and gamma-2 
globin genes,
complete cds. 


1e-15


284078 


hypothetical protein 2 - human 
>gi|182220 


7.4 


1203 


L34587 


Homo sapiens RNA 
polymerase II 
elongation factor SIII
p15 subunit mRNA,
complete cds. > :: 
gb|AR022286|AR022 
286 Sequence 7 from 
patent US 5792634 


9e-16


<NONE> 


<NONE> 


<NONE> 


1204 


D83649 


Xenopus laevis 
mRNA for xSox7 
protein, complete cds 


8e-16 


2447043 


(D83649) xSox7 protein 
[Xenopus laevis]


4e-06 




1205


AC005190


Homo sapiens PAC 
clone DJ1I52D16 
from Xq23; complete 
sequence [Homo 
sapiens] 


3e-16 


<NONE> 


<NONE> 


<NONE> 


1206 


J03626 


Human UMP 
synthase mRNA, 
complete cds. 


3e-16 


113667 


!!!! ALU CLASS B WARNING
ENTRY !!!! 


0.65 


1207 


J00083 


Human Alu family 
interspersed repeat; 
clone BLUR11. 


3e-16


728836 


!!!! ALU SUBFAMILY SP 
WARNING ENTRY 


4e-06 


1208 


U70674 


Mus musculus m-
Numb (m-nb) mRNA,
complete cds 


1e-16


<NONE> 


<NONE> 


<NONE> 


1209 


U66619 


Human SWI/SNF 
complex 60 KDa 
subunit (BAF60c) 
mRNA, complete cds 


1e-16


1549247 


(U66619) SWI/SNF complex 60 
KDa subunit [Homo sapiens] 


0.003 


1210 


U75467 


Drosophila
melanogaster Rga and
Atu genes, complete
cds


1e-16


1658503 


(U75467) Atu [Drosophila
melanogaster] 


5e-32 


1211 


M72709 


Human alternative 
splicing factor 
mRNA, complete cds. 


3e-17 


<NONE> 


<NONE> 


<NONE> 


1212 


U26556 


Human ferritin H
(FTHL13)
pseudogene.


3e-17 


<NONE> 


<NONE> 


<NONE> 


1213 


D32064 


Human gene for 2- 
oxoglutarate 
dehydrogenase, 
complete cds 


3e-17 


2088843 


(AF003386) F59E12.9 gene 
product [Caenorhabditis 
elegans]


0.12 


1214 


M76364 


Human (Papua New
Guinean)
Mitochondrial DNA
control region,
sequence 131.


3e-17


114009 


APAG PROTEIN
>gi|72927|pir||BVECAG apaG
protein - Escherichia coli
>gi|40918 (X04711) URF
hypothetical protein 
[Escherichia coli] 


0.006 


1215 


AF017466


Homo sapiens 
genomic sequence 
from subtelomeric 
region of 
chromosome 4q 


1e-17


3947985 


(U78948) MADS-box protein 2 
[Malus domestica]


4.1 


1216 


AF004876 


Homo sapiens 
54TMp (54tm) 
mRNA, complete cds 


1e-17


4101574 


(AF004876) 54TMp [Homo 
sapiens] 


0.006 


1217 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


9e-18


<NONE> 


<NONE> 


<NONE> 


1218 


AF086758 


Rattus norvegicus Na-
K-2Cl cotransporter


4e-18 


3892703 


(AL033545) putative glycine- 
rich protein [Arabidopsis 
thaliana] 


0.30 


1219 


AF020089 


Homo sapiens 
PEN11B mRNA,
complete cds 


4e-18 


2642493 


(AF023910) DNA 
topoisomerase I [Physarum 
polycephalum]


0.083


1220 


X82333 


H.sapiens IRLB gene 
(exon 1-3)


4e-18 


106837 


irlB protein - human (fragment)
>gi|33969


2e-11


1221


AB002383


Human mRNA for
KIAA0385 gene,
complete cds


4e-18 


3228540 


(AF060181) zinc finger protein 
[Homo sapiens]


6e-25 


1222 


X98485 


P.vivax PV14 gene 


1e-18


<NONE> 


<NONE> 


<NONE> 


1223 


Z79057 


H. sapiens flow-sorted 
chromosome 6 
HindIII fragment,
SC6pA21E8 


1e-18


2981631 


(AB012223) ORF2 [Canis 
familiaris] 


0.001 


1224 


L01457 


Homo sapiens (clone 
JH4B1) PM-scl
autoantigen mRNA, 
complete cds. 


1e-18


346287 


nucleolar 100K polymyositis- 
scleroderma protein - human 
>gi|35555 (X66113) PM/Scl 
100kD nucleolar protein [Homo
sapiens] 


0.001 


1225 


L02897 


Dog nonerythroid 
beta-spectrin mRNA,
3' end. 


4e-19 


3493358 


(AB017037) nonstructural
protein precursor [Himetobi P 
virus] 


0.12 


1226 


AB012162 


Homo sapiens mRNA 
for APCL protein, 
complete cds 


4e-19 


3894265 


(AB012162) APCL protein 
[Homo sapiens] 


0.002 


1227 


AB011093 


Homo sapiens mRNA 
for KIAA0521 
protein, partial cds 


4e-19 


3043566 


(AB011093) KIAA0521 protein 
[Homo sapiens] 


9e-09 


1228 


X78454 


X.laevis AB21 
mRNA for RPD3 
homologue 


4e-19 


3023945 


HISTONE DEACETYLASE 
(HD) thaliana] 


5e-34 


1229 


U88895 


Human endogenous 
retrovirus H D1
leader 

region/integrase- 
derived ORF1, 
ORF2, and putative 
envelope protein 
mRNA, complete cds 


2e-19 


59977 


(Z14310) tripartite fusion
transcript PLA2L [Human 
endogenous retrovirus] 


1e-04


1230 


U34377 


Human tyrosine 
kinase TXK (txk) 
gene, exon 13. 


1e-19


728831 


!!!! ALU SUBFAMILY J 
WARNING ENTRY 


3e-05 


1231 


X72966 


M.musculus rab3A
gene 


1e-19


2408076 


(Z99167) putative peroxisomal
organisation and biogenesis 
protein [Schizosaccharomyces 
pombe] 


2e-09 


1232 


AB007953 


Homo sapiens 
mRNA, chromosome 
1 specific transcript 
KIAA0484 


4e-20 


<NONE> 


<NONE> 


<NONE> 












1233


D14034


Human gene for Zn-
alpha2-glycoprotein,
complete cds


2e-20


3928756


(AB001535) similar to
C.elegans hypothetical protein
4D1.5. similar to trp and trp-like
proteins [Homo sapiens]


1e-07


1234 


X82126 


H.sapiens HOK-2 
gene, exon 2 


2e-20 


2137269 


DNA-binding protein - mouse 
>gi|437444 


1e-19


1235 


AF093684 


Luciferase reporter 
vector pXP2 *SA,. 
complete sequence 


5e-21 


2773363 


(AF041382) microtubule
binding protein D-CLIP-190


5.5 


1236 


J05272 


Human IMP 
dehydrogenase type 1 
mRNA complete cds. 


5e-21 


124417 


INOSINE-5'- 

DEHYDROGENASE 1 (IMP 
DEHYDROGENASE I) 
(IMPDH-I) (IMPD 1) I - human 


2e-04 


1237 


D86997 


Human (lambda) 
DNA for 

immunoglobulin light 
chain 


5e-21 


3878261 


(Z75712) Similarity to S. Pombe
BEM1/BUD5 suppressor; 
cDNA EST EMBL:Z14470 
comes from this gene; cDNA 
EST yk482d4.3 comes from this 
gene; cDNA EST yk482d4.5 
comes from this gene 
[Caenorhabditis elegans]


6e-46 


1238 


Z79865


H.sapiens
chromosome 22 CpG
island DNA genomic
MseI fragment, clone
J02f3, forward read
J02f3.f


2e-21


2739037


(AF024614) ADAM 10
[Caenorhabditis elegans] Zinc-
binding metalloprotease domain;
cDNA EST CEMSA42F comes
from this gene; cDNA EST
yk21Sf3.3 comes from this gene;
cDNA EST yk443d9.3 comes
from this gene; cDNA EST
yk443d9.5 comes from this
gene; cDNA...


2.6 












(/TLwojoj; simnai iu uinumi 




1239 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


6e-22 




" C, tDNA E3T > UjOU3.j luuur 
from this gene: cDNA EST 
yk249a6.5 comes from this 
gene; cDNA EST yk219a2.5
comes from this gene; cDNA 
EST yk355e4.5 comes from this 
gene; cDNA EST yk224f4.5 
comes fr... 

>gi|392488 l|gnl|PID|e 1 354569 
from this gene; cDNA EST 
yk249a6.5 comes from this 
gene; cDNA EST yk219a2.5 

comes from this gene; cDNA

EST yk355e4.5 comes from this 
gene; cDNA EST yk224f4.5 
comes from... 


0.35 


1240


U67824 


Human primary Alu 
transcript 


6e-22 


728832 


!!!! ALU SUBFAMILY SB 
WARNING ENTRY 


5e-07 


1241 


AF070636 


Homo sapiens clone 
24686 mRNA 
sequence 


2e-22 


98710 


fatty-acid synthase (EC 
2.3.1.85) - Brevibacterium
ammoniagenes


2.5 


1242 


D14034


Human gene for Zn-
alpha2-glycoprotein,
complete cds 


2e-22 


4185939 


(Y17832) pol protein [Human 
endogenous retrovirus K]


0.29 


1243 


M61835 


Human lactase 
phlorizin hydrolase 
(LCT) gene, exon 2. 


2e-22 


728831 


!!!! ALU SUBFAMILY J 
WARNING ENTRY 


0.006 


1244 


AF100694 


Mus musculus
Pontin52 mRNA, 
complete cds 


6e-23 


1350828 


RABPHILIN-3A 
>gi|477100|pir||A48097
rabphilin-3A - bovine 
>gi|285646|gnl|PID|d KXWSS 


0.14 


1245 


AF074985 


Homo sapiens full
length insert cDNA
VH73H06 


8e-24 


3170548


(AF056116) unknown [Fugu
rubripes]


0.24 


1246 


D14878


Human mRNA for
protein D123,
complete cds


7e-24 


<NONE> 


<NONE> 


<NONE> 


1247 


D16917


Human HepG2 3'
region cDNA, clone
hmd3d07


6e-24 


1397345


(U61955) contains multiple
region of strong similarity to
C2H2-type zinc fingers
(PS:PS00028) [Caenorhabditis
elegans]


2.4 






1248


Z69654


Human DNA
sequence from
cosmid L98A6,
Huntington's Disease
Region, chromosome
4p16.3.


3e-24 


4240566 


(AF123462) neurexin III [Homo
sapiens] 


4.5 


1249 


AB007914 


Homo sapiens mRNA
for KIAA0445 
protein, complete cds 


2e-24 


3885949 


(AF095568) amelogenin 
[Paleosuchus palpebrosus] 


3.2 


1250 


AF088072 


Homo sapiens full 
length insert cDNA 
clone ZD93D10


2e-24 


323091 


immunodominant microneme 
protein Etp100 - Eimeria tenella
>gi|2707733 (AF032905) 
microneme protein precursor 
Etmic-1 [Eimeria tenella]


0.34 




1251


AF069489


Homo sapiens cAMP-
specific

phosphodiesterase 4A
variant pde46
(PDE4A) gene, exons
2 through 13 and
alternative splice
exons 3a, 6a, 6b, and
9a


2e-24 


728836 


!!!! ALU SUBFAMILY SP 
WARNING ENTRY 


1e-05


1252 


Y12853


Homo sapiens P2X7 
gene, exon 4-8 


9e-25 


728831 


!!!! ALU SUBFAMILY J 
WARNING ENTRY 


1e-05


1253 


M27830 


Human 28S
ribosomal RNA gene, 
complete cds. 


8e-25 


<NONE> 


<NONE> 


<NONE> 




1254


AB007953


Homo sapiens
mRNA, chromosome
1 specific transcript
KIAA0484


8e-25 


<NONE> 


<NONE> 


<NONE> 


1255 


Z60212 


H.sapiens CpG DNA,
clone 195c8, forward
read cpg195c8.ft1a.


8e-25 


158154 


(M81959) POU domain protein 
[Drosophila melanogaster] 


3.3 


1256 


AF100694 


Mus musculus
Pontin52 mRNA,
complete cds


7e-25 


<NONE> 


<NONE> 


<NONE> 


1257 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


7e-25 


<NONE> 


<NONE> 


<NONE> 


1258 


Y12851 


Homo sapiens P2X7
gene, exon 1 and
joined CDS


2e-25 


<NONE> 


<NONE> 


<NONE> 









1259


U64033


Mus musculus Tera
(Tera) mRNA,
complete cds


9e-26 


<NONE> 


<NONE> 


<NONE> 


1260 


U19181 


Rattus norvegicus 
Rabin3 mRNA,
complete cds. 


9e-26 


624225 


(U19181) Rabin3 [Rattus
norvegicus] 


1e-13


1261 


AF020788 


Caenorhabditis 
elegans SEL-10 (sel-
10) mRNA, complete
cds 


9e-26 


3915881 


atL- iu I'ku i tiiN Candida 
CDC4 gene (TR:E234056); 
cDNA EST EMBL:D27699 
comes from this gene; cDNA 
EST EMBL:D27698 comes 
from this gene; cDNA EST 
EMBL:D32793 comes from this 
gene; cDNA EST 
EMBL:D33271 comes from this
gen... 



7e-32 


1262 


AB016930 


Cricetulus griseus 
mRNA for 

Phosphatidylglycerop 
hosphate synthase, 
complete cds 


8e-26 


4159682 


(AB016930)

Phosphatidyl glycerophosphate 
synthase [Cricetulus griseus] 


0.045 


1263 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


3e-26 


3878629 


(Z93385) predicted using 
Genefinder; Similarity to 
B.subtilis GTP-binding protein 


2e-10 


1264 


X91195


H.sapiens SOM172 
mRNA 


1e-26


<NONE> 


<NONE> 


<NONE> 


1265 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-26


1360637 


(X95995) ENBP1 [Vicia sativa]


3.1 


1266 


L08237 


Human MG21
mRNA, partial cds.


1e-26


950411 


(L08237) located at OATL1 
[Homo sapiens] 


9e-09 


1267 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


9e-27 


3881080 


(AL032657) similar to EGF-like 
domain; cDNA EST yk299al2.3 
comes from this gene; cDNA 
EST EMBL:D35398 comes 
from this gene; cDNA EST 
yk331h6.5 comes from this 
gene; cDNA EST yk299al2.5 
comes from this gene; cDNA 
EST yk467.eS.... 


0.001


1268 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


8e-27 


1731324


HYPOTHETICAL PROTEIN 
>gi| 166306 


4.0 
















1269 


X89211


H.sapiens DNA for 
endogenous retroviral 
like element 


8e-27 


2065209 


(Y12713) Gag polyprotein [Mus 
musculus] 


0.005 


1270 


U73166 


Homo sapiens cosmid
clone LUCA15 from
3p21.3, complete
sequence [Homo 
sapiens] 


3e-27 


728831 


!!!! ALU SUBFAMILY J 
WARNING ENTRY 


4e-04 


1271 


D78255 


Mouse mRNA for 
PAP-1, complete cds


3e-27 


1850098 


(D78255) PAP-1 [Mus
musculus] 


2e-10 


1272 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-27


2133579 


spermatophorin Sp23 - yellow 
mealworm molitor]


0.39 


1273 


AB015202


Homo sapiens gene 
for hippocalcin, exon 
2, 3 and complete cds 


1e-27


3877698 


(Z83318) predicted using
Genefinder; cDNA EST 
yk369e7.5 comes from this gene 
[Caenorhabditis elegans] 


0.37 


1274 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-27


3328188 


(AF074902) laminin alpha chain 
[Caenorhabditis elegans] 


0.19 


1275 


Z29336 


H.sapiens gene for 
Cu/Zn-superoxide 
dismutase 


1e-27


728831 


!!!! ALU SUBFAMILY J 
WARNING ENTRY 


6e-05 


1276 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


9e-28 


2133579 


spermatophorin Sp23 - yellow 
mealworm molitor] 


9.2 


1277 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


9e-28 


2133579 


spermatophorin Sp23 - yellow 
mealworm molitor] 


0.054 


1278 


AB001636


Homo sapiens mRNA 
for ATP-dependent 
RNA helicase #46, 
complete cds 


4e-28 


3913425 


PUTATIVE PRE-MRNA
SPLICING FACTOR ATP- 
DEPENDENT RNA 
HELICASE >gi|2275203 
(AC002337) RNA helicase 
isolog [Arabidopsis thaliana] 


3e-22 


1279 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


3e-28 


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


0.066 












1280


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


3e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


4e-05 


1281 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE> 


<NONE> 


<NONE> 


1282 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE>


<NONE>


<NONE>


1283 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE>


<NONE>


<NONE>


1284 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE> 


<NONE> 


<NONE> 


1285 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE> 


<NONE> 


<NONE> 


1286 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE> 


<NONE> 


<NONE> 


1287 


AF100694 


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


140505


PROBABLE INTRON
MATURASE liverwort
(Marchantia polymorpha)
chloroplast >gi|11663


3.0 


1288 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


140505


PROBABLE INTRON
MATURASE liverwort
(Marchantia polymorpha)
chloroplast >gi|11663


1.8 


1289 


AF100694


Mus musculus
Pontin52 mRNA,




2133579


spermatophorin Sp23 - yellow 
mealworm molitor] 


U.JU 


1290 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


0.057


1291 


Z63029 


H.sapiens CpG DNA,
clone 77b3, forward
read cpg77b3.ft1a.


1e-28


2493240 


HYPOTHETICAL 29.3 KD 
PROTEIN pseudotsugata 
nuclear polyhedrosis virus]


0.014 













1292


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


118588


DEHYDRIN DHN3
>gi|100035|pir||S18139 dehydrin
DHN3 - garden pea >gi|20709
(X63063) pea dehydrin DHN3
[Pisum sativum]


0.010


1293 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


0.007 


1294 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788 
come from this gene. 
[Arabidopsis thaliana] 


0.002 


1295 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


126363


LAMININ ALPHA-1 CHAIN
PRECURSOR precursor - 
human 


3e-04 


1296


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


1e-04


1297


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


3e-05 


1298 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


3157926


(AC002131) Strong similarity to
extensin-like protein gb|Z34465
from Zea mays. [Arabidopsis
thaliana]


2e-05 


1299 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


1e-05


1300


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


320919


kinetoplast-associated protein -
Trypanosoma cruzi >gi|162142
(M25364) kinetoplast-associated
protein


1e-07


1301


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


9e-08 


1302


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


1e-09


1303


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788 
come from this gene. 
[Arabidopsis thaliana] 


9e-10 


1304


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


4e-10 


1305


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


9e-11


1306


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


6e-11









1307


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


4e-29 


<NONE> 


<NONE> 


<NONE> 


1308 




Homo sapiens cAMP- 
specific 

phosphodiesterase 8B 




<NONE>


<NONE>


<NONE>


1309


X93334 


H. sapiens 

mitochondrial DNA,
complete genome 


4e-29 


116977 


CYTOCHROME C OXIDASE 
POLYPEPTIDE I chain I - 
human mitochondrion (SGC1) 
>gi|i jwo \ vuuooz; cyiocnrome 
oxidase I [Homo sapiens] 
>gi|506829 (J01415)
cytochrome oxidase subunit 1 
[Homo sapiens] sapiens] 


3e-09 


1310 


AF020760 


Homo sapiens serine 
protease (Omi) 
mRNA, complete cds 


4e-29 


2738915 


(AF020760) serine protease 
[Homo sapiens] 


8e-12 


1311 


U95097 


Xenopus laevis 
mitotic 

phosphoprotein 43 
mRNA, partial cds 


4e-29 


2072294 


(U95097) mitotic
phosphoprotein 43 [Xenopus 
laevis] 


1e-25


1312 


L32162 


Homo sapiens 
transcription factor 
mRNA, 5' end. 


2e-29 


2501706 


RENAL TRANSCRIPTION 
FACTOR KID-1 finger protein 
[Mus musculus] 


8e-15 


1313 


AF100694 


Mus musculus
Pontin52 mRNA,
complete cds


1e-29


4056454


region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


1e-04


1314 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-29


1169643 


FMRFAMIDE-RELATED
NEUROPEPTIDES 
PRECURSOR >gi|416208 
(U03137) neuropeptide 
precursor FMRFamide-related 
peptide [Lymnaea stagnalis] 


1e-05


1315 


U50839 


Homo sapiens g16
protein (g16) mRNA,
complete cds


1e-29


3212101 


(AF069517) RNA binding 
protein DEF-3 [Homo sapiens]


6e-10















1316


X69711


H.sapiens mRNA for
ICAM-R


5e-30


299356


intercellular adhesion molecule
3, ICAM-3=lymphocyte
function-associated antigen 1
counter-receptor homolog
[human, tonsil, Peptide Partial,
518 aa]


3e-08 


1317 


AF010227


Homo sapiens 
receptor-associated 
coactivator 3 


5e-30 


2331250 


(AF012108) Amplified in Breast 
Cancer [Homo sapiens] 


8e-09 


1318 


AF086395 


Homo sapiens full 
length insert cDNA 
clone ZD75C01 


2e-30 


3861241 


(AJ235273) CELL SURFACE 
ANTIGEN (sca5) 


4.2 


1319 


M27830 


Human 28S 
ribosomal RNA gene, 
complete cds. 


2e-30 


1730522 


PHOSPHOGLYCERATE 
KINASE 2.7.2.3) - Pyrococcus 
woesei >gi| 1054832 (X73527) 
phosphoglycerate kinase 
[Pyrococcus woesei] 


3.8 


1320 


M79307 


Mouse GTP-binding 
protein (Rab17)
mRNA sequence. 


2e-30 


464564 


RAS-RELATED PROTEIN 
RAB-17 Rab17 - mouse
(fragment) >gi|297157
(X70804) rab17 [Mus musculus]


9e-11


1321 


AL022168 


Human DNA 
sequence from clone 
U247E12 on
chromosome Xq22- 
23, complete 
sequence [Homo 
sapiens] 


1e-30


2072967 


(U93570) putative p150 [Homo
sapiens] 


3e-11


1322 


X85124 


M.musculus pacsin 
gene


1e-30


2217964 


(Z50798) p52 [Gallus gallus] 


1e-34


1323 


U37408 


Homo sapiens 
phosphoprotein CtBP 
mRNA, complete cds


5e-31 


74518 


structural polyprotein - 
Venezuelan equine encephalitis 
virus (strain TRD) >gi|323710. 
(J04332) poly-envelope protein 
[Venezuelan equine encephalitis 
virus] 


1.1 


1324 


L04193 


Human lens 
membrane protein 
(mp19) gene, exon
11. 


2e-31


728831 


!!!! ALU SUBFAMILY J 
WARNING ENTRY 


7e-07 


1325 


M11167 


Human 28S 
ribosomal RNA gene. 


6e-32 


<NONE> 


<NONE> 


<NONE> 
















1326 


M33336 


Human cAMP- 
dependent protein 
kinase type I-alpha 
subunit (PRKAR1A) 
mRNA, complete cds 


2e-32 


<NONE> 


<NONE> 


<NONE> 


1327 


J03060


Human 

glucocerebrosidase 
pseudogene, complete 
cds 


2e-32 


2144479 


glucosylceramidase (EC 
3.2.1.45) precursor - human 


1e-05


1328 


U33053 


Human lipid- 
activated protein 
kinase PRK1 mRNA, 
complete cds 


7e-33 


2137689 


protein kinase (EC 2.7.1.37) - 
mouse 


1e-14




1329


J04617


Human elongation 
factor EF-1- alpha 
gene, complete cds. > 
:: dbj|E02629|E02629 
DNA of human 
polypeptide chain 
elongation factor- 1 
alpha 


6e-33 


<NONE> 


<NONE> 


<NONE> 


1330 


L40396 


Homo sapiens (clone 

s22i71) mRNA

fragment 


6e-33 


124235 


INTERMEDIATE FILAMENT 
PROTEIN B protein B - 
common roundworm 


1.00 


1331


Z72813


S.cerevisiae 
chromosome VII 
reading frame ORF 
YGR028w 


6e-33 


1709135 


MSP1 PROTEIN HOMOLOG 
Yeast MSP 1 protein (TAT- 
binding homolog 4) 


8e-50 


1332 


AB007941


Homo sapiens mRNA 
for KIAA0472 
protein, partial cds 


2e-33 


1150834


(U42471) Wiscott-Aldrich
Syndrome protein homolog 
[Mus musculus] 


2.0 


1333 


AF044574 


putative peroxisomal 
2,4-dienoyl-CoA 
reductase (DCR- 
AKX) mRNA, 
complete cds 


2e-34 


4105269 


(AF044574) putative 
peroxisomal 2,4-dienoyl-CoA 
reductase [Rattus norvegicus]


6e-15 


1334 


D14657 


Human mRNA for
KIAA0101 gene,
complete cds 


7e-35 


<NONE> 


<NONE> 


<NONE>


1335 


X69910 


H.sapiens p63 mRNA 
for transmembrane
protein 


7e-35 


2136323 


trithorax homolog HTX - human
(fragment) homolog=MLL
[alternative splicing, clone 14p-
18B)


0.94 






1336


AF053455


Homo sapiens
tetraspan TM4SF
(TSPAN-5) gene,
complete cds


7e-35 


3152703 


(AF065389) tetraspan NET-4 
[Homo sapiens] 


1e-25


1337 


X58374 


D.melanogaster crn
mRNA 


3e-35 


117478 


CROOKED NECK PROTEIN 


6e-41 


1338 


AF086492 


Homo sapiens full 
length insert cDNA 
clone ZD95D11 


9e-36 


2909809 


(AF031328) aminoglycoside 6'- 
N-acetyltransferase It 


1.9 


1339 


Z96223 


H. sapiens telomeric 
DNA sequence, clone 
12PTEL120, read
12PTEL00120.seq


3e-36 


2408068 


(Z99165) hypothetical protein 


0.61 


1340 


Z37986 


H.sapiens mRNA for 
phenylalkylamine 
binding protein. 


1e-36


1362793 


emopamil-binding protein - 
human >gi|780263 


5e-11






1341


U57847


Human ribosomal
protein S27 mRNA,
complete cds. > ::
gb|AA316327|AA316327
EST188061 HCC
cell line (metastasis to
liver in mouse) II
Homo sapiens cDNA
5' end similar to
similar to
metallopanstimulin 1


3e-37 


1171014 


40S RIBOSOMAL PROTEIN 
S27 growth factor-inducible zinc 
finger protein MPS-1 - human 
>gi|431319 (L19739)
metallopanstimulin [Homo 
sapiens] >gi| 1373421 (U57847) 
ribosomal protein S27 


1.4 


1342 


Y15054


Rattus norvegicus 
mRNA for 70 kDa 
tumor specific 
antigen, partial 


3e-37 


3123027 


70 KD WD-REPEAT TUMOR-
SPECIFIC ANTIGEN
>gi|2505957|gnl|PID|e353992
(Y15054) 70 kD tumor-specific
antigen [Rattus norvegicus] 


2e-15 


1343 


AF084205 


Rattus norvegicus 
serine/threonine 
protein kinase TAO1
mRNA, complete cds


3e-37 


3452473 


(AF084205) serine/threonine 
protein kinase TAO1 [Rattus
norvegicus]


5e-4" 


1344 


X78604 


R. norvegicus 
(Sprague Dawley)
ARL5 mRNA for
ARF-like protein 5


1e-37


<NONE> 


<NONE> 


<NONE> 



WO 01/02568 



PCT/US00/18374 





















1345 


AJ236644 


Homo sapiens 
chromosome 22 CpG 
island DNA, genomic 
Msel fragment, clone 
22CGIB49A3 . 
complete read 


1e-37


2239219 


(^y/2iuj nypoineticai protein 


OC-UJ 


1346 


U09367 


Human zinc finger 
protein ZNF136 


4e-39 


2137269 


DNA-binding protein - mouse 
>gi|437444


7e-23 


1347 


Z69649 


Human DNA 
sequence from 
cosmid L69F7B, 
Huntington's Disease
Region, chromosome
4p16.3 contains
Huntington Disease 
(HD) gene. 






(AL023094) putative cyclase 
associated protein CAP 
[Arabidopsis thaliana] 


J.O 


1348 


AF065389 


Homo sapiens 
tetraspan NET-4 
mRNA, complete cds


1e-39


3152703 


(AF065389) tetraspan NET-4 
[Homo sapiens] 


6e-29 


1349 


AF038172 


Homo sapiens clone 
23923 mRNA 
sequence 


1e-40


1813464 


(U60883) CapC [Bacillus 
firmus] 


2.8 


1350 


Z83095 


H.sapiens Fanconi 
anaemia group A 
gene, exons 39, 40, 
41, 42 and 43


1e-40


2137870 


zinc finger protein - mouse 
(fragment) 


3e-23 




1351


AF057734


Homo sapiens 17- 
beta-hydroxysteroid 
dehydrogenase IV 
(HSD17B4) gene, 
exon 16 


1e-40


2842416 


(AL008730) dJ487J7.1.1
(putative protein dJ487J7.1
isoform 1) [Homo sapiens]


6e-61 


1352 


AF070567 


Homo sapiens clone 
^4544 beta- 
dystrobrevin mRNA, 
partial cds 


4e-41 


3133087 


(Y15718) dystrobrevin B DTN-
B2 [Homo sapiens] 


7e-13


1353 


AF006088 


Homo sapiens Arp2/3 
protein complex 
subunit p16-Arc
(ARC16) mRNA,
complete cds 


2e-41


3121767 


ARP2/3 COMPLEX 16 KD 
SUBUNIT 


3e-36 


1354 


X69942 


M.musculus mRNA 
of enhancer-trap- 
locus 1 


6e-42 


2291152 


(AF0I641S) No definition line 
found [Caenorhabditis elegans] 


6.4 
















1355 


X87838 


H.sapiens mRNA for 
beta-catenin 


5e-42 


1373019 


(U28811) cysteine-rich
fibroblast growth factor receptor 


8e-05


1356 


AB018268 


Homo sapiens mRNA 
for KIAA0725 
protein, partial cds 


5e-42 


3882171 


(AB018268) KIAA0725 protein
[Homo sapiens] 


2e-33 


1357 


M84424 


Human cathepsin E 
(CTSE) gene, exon 9 
and complete cds. 


2e-42 


<NONE> 


<NONE> 


<NONE> 


1358 


U80776 


Human EST clone 
NIB 1543 mariner 
transposon Hsmarl 
orf gene, complete 
cds 


2e-42 


2231380 


(U80776) orf; encodes putative
chimeric protein with SET 
domain in N-terminus with 
similarity to several other 
human, Drosophila, nematode 
and yeast proteins [Homo 
sapiens] 


3e-11


1359 


U55184 


Human G protein 
Golf alpha gene, exon 
12 and complete cds 


2e-42 


3165531 


(AF067608) No definition line 
found [Caenorhabditis elegans]


1e-16


1360 


AC005190


Homo sapiens PAC 
clone DJ1152D16 
from Xq23, complete 
sequence [Homo 
sapiens] 


6e-43 


2978255 


(AB007407) myeloid zinc finger 
protein-2 [Mus musculus] 


2.3 


1361 


AB018284


Homo sapiens mRNA 
for KIAA0741 
protein, complete cds 


5e-43 


<NONE> 


<NONE> 


<NONE> 


1362 


AB011137 


Homo sapiens mRNA 
for KIAA0565
protein, complete cds 


5e-43 


3043654 


(AB011137) KIAA0565 protein 
[Homo sapiens] 


1e-07


1363 


M93651 


Human set gene, 
complete cds. 


2e-43 


<NONE> 


<NONE> 


<NONE> 


1364 


Z47087 


H.sapiens mRNA for 
RNA polymerase II 
elongation factor-like 
protein. 


2e-43 


1872514 


(U84404) E6-associated protein
E6-AP/ubiquitin-protein ligase 
[Homo sapiens] >gi|2361031 
(AF01670S) E6-AP ubiquitin- 
protein ligase [Homo sapiens] 


7.2 


1365 


U27197 


Drosophila 
melanogaster pelota 
(pelo) mRNA,
complete cds 


2e-43 


1352736 


PELOTA PROTEIN >gi|973224 
(U27197) pelota [Drosophila 
melanogaster] 


1e-46












1366


D80007


Human mRNA for
KIAA0185 gene,
partial cds


6e-44


2498864


RRP5 PROTEIN HOMOLOG
(KIAA0185) hypothetical
protein YM9959.11C of
S.cerevisiae. [Homo sapiens]


6e-09 


1367 


AF005039 


Homo sapiens 
secretory carrier 
membrane protein 
(SCAMP3) mRNA, 
complete cds 


6e-44 


2232243 


(AF005039) secretory carrier 
membrane protein [Homo 
sapiens] 


2e-09 


1368 


X68101 


R.norvegicus trg 
mRNA 


2e-44 


550420 


(X68101) trg gene product 
[Rattus norvegicus] 


1e-37


1369 


AF044206 


Homo sapiens 
cyclooxygenase
(COX-2) gene,
promoter and exon 1


2e-45 


2072953 


(U93565) putative p150 [Homo
sapiens] 


5e-06 


1370 


L48708 


Homo sapiens 
faciogenital dysplasia 
(FGD1) gene, 5' end
of intron 17 


8e-46 


<NONE> 


<NONE> 


<NONE> 


1371 


X15822


Human COX VIIa-L 
mRNA for liver- 
specific cytochrome c 
oxidase (EC 1.9.3.1.) 


3e-46 


117121 


CYTOCHROME C OXIDASE
POLYPEPTIDE VIIA-LIVER
PRECURSOR
>gi|2144370|pir||OSHU7L
cytochrome-c oxidase (EC
1.9.3.1) chain VIIa precursor,
hepatic - human >gi|30147
(X15822) precursor (AA -23 to
60) [Homo sapiens] 


5e-13 


1372 


U47323 


Mus musculus 
stromal cell protein 
mRNA, complete cds 


3e-46 


1493833 


(U47323) stromal cell protein 
[Mus musculus]


le-4S 


1373 


AF059524 


Homo sapiens 
reticulon gene family 
protein 


7e-47 


1731169 


HYPOTHETICAL 113.1 KD
PROTEIN T28D9.7 IN
CHROMOSOME II >gi|csoliOH
(U28738) coded for by C.
elegans cDNA yk8h5.3; coded
for by C. elegans cDNA
yk8h5.5; similar to C. elegans
deg-1 and mec-4 in exon 2
[Caenorhabditis elegans]


7.S 


1374 


AJ132583


Homo sapiens mRNA
for puromycin
sensitive
aminopeptidase,
partial


3e-47 


1777519 


(U39123) T cell receptor beta 
chain [Homo sapiens]


9.7 



1375 


M97856 


Homo sapiens histone 
binding protein 
mRNA, complete cds. 


3e-47 


2645327 


(U83821) NADH 
dehydrogenase subunit 3 
[Oryzomys palustris]


5.7 


1376 


U53220 


Human
retinoblastoma-
related Rb2/p130
gene, 5' flanking 
region and partial cds 


3e-47 


2499225 


CMP-SIALIC ACID
TRANSPORTER CMP-sialic 
acid transporter [Cricetulus 
griseus] 


5.3 


1377 


X87870 


H. sapiens mRNA for 
hepatocyte nuclear 
factor 4a 


1e-47


728832 


!!!! ALU SUBFAMILY SB 
WARNING ENTRY 


7.3 


1378 


AF060195 


Mus musculus
proteasome regulator 
PA28 beta subunit 
gene, complete cds 


3e-48 


478681 


limb deformity protein - chicken 


0.25 


1379 


AB018285


Homo sapiens mRNA 
for KIAA0742
protein, partial cds 


1e-48


3122969 


TESTIS SPECIFIC PROTEIN
A (ZINC FINGER PROTEIN 
TSGA) >gi|281040|pir||S28499 
probable zinc finger protein - rat 
>gi|57504 (X59993) zinc finger 
protein 


1e-30


1380


U35032 


Human endogenous 
retrovirus clone 
c5.11, HERV-H
multiply spliced
subgenomic leader,
protease and integrase 
region mRNA, partial 
cds 


4e-49 


88558 


retroviral proteinase-like protein 
- human 


6e-05 


1381 


AB007956 


Homo sapiens 
mRNA, chromosome 
1 specific transcript 
KIAA0487 


1e-49


<NONE> 


<NONE> 


<NONE> 


1382 


D86987 


Homo sapiens mRNA 
for KIAA0214
protein, complete cds 


1e-49


2497944 


ALPHA SCRUIN >gi|63323S
(Z38132) scruin [Limulus
polyphemus]
>gi|1093326|prf||2103269A
scruin [Limulus sp.]


9.7 


1383 


U25826 


Human transcription 
factor (SC1) gene,
complete cds. 


4e-50 


<NONE> 


<NONE> 


<NONE> 


1384


U46690


Mus musculus ATP-
dependent RNA
helicase mRNA,
partial cds.


4e-50 


1335873 


(U46690) ATP-dependent RNA 
helicase [Mus musculus] 


3e-24 


1385 


AF072128


Mus musculus 
claudin-2 mRNA,
complete cds 






(AF072128) claudin-2 [Mus
musculus]


4e-24 


1386 


AF093593 


Homo sapiens 
snRNA activating 
protein complex
19kDa subunit
complete cds


1e-50


3668416 


(AF093593) snRNA activating 
protein complex 19kDa subunit 
[Homo sapiens] 


0.003 


1387 


U79745 


Homo sapiens 
monocarboxylate
transporter
homologue MCT6
mRNA, complete cds 


1e-50


1177607 


(X92485) pva1 [Plasmodium
vivax] 


2e-07 


1388 


L09647 


Rattus norvegicus
hepatocyte nuclear 
factor 3a 


1e-50


404764 


(L10409) fork head related 
protein [Mus musculus] 


2e-21 


1389 




Mouse E46 mRNA
for E46 protein


de-Si 




BRAIN PROTEIN E46


1e-20


1390 


M33387 


Human debrisoquine 
4-hydroxylase 
(CYP2D8P) and


1e-51


126296 


LINE-1 REVERSE 
TRANSCRIPTASE 
HOMOLOG protein 
[Nycticebus coucang] 


5e-15


1391 


AF019767 


Homo sapiens zinc 
finger protein (ZPR1) 
mRNA, complete cds 


4e-52


961507 


(D63788) anchor protein, LCM 


5.9 


1392 


Z37986 


H.sapiens mRNA for 
phenylalkylamine 
binding protein. 


2e-52 


<NONE> 


<NONE> 


<NONE> 


1393 


U65416 


Human MHC class I 
molecule (MICB)
gene, complete cds 


2e-52 


3878637 


(Z49I2JS) weak similarity with 
SINR protein (Swiss Prot 
accession number P06533); 
cDNA EST EMBL;T00631 
comes from this gene; cDNA 
EST yk293d10.5 comes from
this gene [Caenorhabditis
elegans]


8.7 


1394 


Z57647 


H.sapiens CpG DNA 
clone 189a6, forward
read cpg189a6.ft1a.


2e-52 


111187 


beta-globin DNA-binding
protein B1, transcription factor
PU.1 - mouse >gi|200586
(M32370) PU.1 protein [Mus
musculus] >gi|zuuy/ 1 4
(M38252) transcription factor
Pu.1 [Mus musculus]


5.8 


1395 


L13738 


Human activated 
p21cdc42Hs kinase 
(ack) mRNA. 
complete cds. 


2e-52 


2921447 


(AF037260) non-receptor 
protein tyrosine kinase Ack 
[Mus musculus]


7e-23 


1396 


AF042379 


Homo sapiens spindle 
pole body protein 
spc97 homolog GCP2 
mRNA, complete cds 


7e-53 


2801701


(AF042379) spindle pole body 
protein spc97 homolog GCP2


1e-16


1397 


AF047441 


Homo sapiens RNA 
polymerase I 40kD
subunit mRNA, 
complete cds 


6e-53 


3914807 


DNA-DIRECTED RNA
POLYMERASE I 40 KD
POLYPEPTIDE (RPA40)
(RPA39) >gi|2266929 
(AF008442) RNA polymerase I 
subunit hRPA39 [Homo 
sapiens] 


4e-19 


1398 


AF104670


Homo sapiens cell 
cycle protein 
(PA2G4) gene, exons
6 through 13, and 
complete cds 


2e-53 


<NONE> 


<NONE> 


<NONE> 


1399 


S60754


{VNTR locus DXZ4,
hypervariable tandem
repeat cluster}
[human, Genomic,
2991 nt] > ::
gb|L07935|HUMVNT
^A Homo sapiens
microsatellite VNTR
DNA sequence.


2e-53


1209669


(U38810) CAGR1 [Homo
sapiens] >gi|3098420
(AF040945) homeotic regulator
homolog MAB21 [Mus
musculus]


4.6 


1400 


D86972


Human mRNA for
KIAA021S gene,
complete cds


1e-53


3426041


(AC005168) unknown protein
[Arabidopsis thaliana]


9.1



1401


AJ236682


Homo sapiens
chromosome 22 CpG
island DNA, genomic
MseI fragment, clone
22CGIB49E6,
complete read


7e-54


3928721


(AL034355) putative
cytochrome oxidase subunit I
[Streptomyces coelicolor]


0.30


1402


AJ236682


Homo sapiens
chromosome 22 CpG
island DNA, genomic
MseI fragment, clone
22CGIB49E6,
complete read


6e-54


3928721


(AL034355) putative
cytochrome oxidase subunit I
[Streptomyces coelicolor]


0.28


1403


M37583


Human histone
(H2A.Z) mRNA,
complete cds.


6e-54


70711


histone H2A.F, embryonic -
chicken


2e-16


1404


AJ009947


Homo sapiens mRNA
for putative ATPase,
partial


6e-54


3550295


(AJ009947) putative ATPase
[Homo sapiens]


3e-18


1405


Y08459


B.taurus mRNA for
novel cytoplasmic
protein


2e-54


<NONE>


<NONE>


<NONE>


1406


AF042384


Homo sapiens BC-2
protein mRNA,
complete cds


2e-54


2828147


(AF042384) BC-2 protein
[Homo sapiens]


2e-14


1407


AF042379


Homo sapiens spindle
pole body protein
spc97 homolog GCP2
mRNA, complete cds


8e-55


2801701


(AF042379) spindle pole body
protein spc97 homolog GCP2


2e-17


1408


AF005355


Oryctolagus
cuniculus translation
initiation factor
eIF2C mRNA,
complete cds


7e-55


3253159


(AF005355) translation
initiation factor eIF2C


3e-53


1409


AF008442


Homo sapiens RNA
polymerase I subunit
hRPA39 mRNA,
complete cds


3e-55


3335138


(AF047441) RNA polymerase I
40kD subunit [Homo sapiens]


3e-20


1410


AF047441


Homo sapiens RNA
polymerase I 40kD
subunit mRNA,
complete cds


3e-55


3335138


(AF047441) RNA polymerase I
40kD subunit [Homo sapiens]


3e-20


1411


X08004


Human mRNA for
Rap1B protein > ::
emb|A08693|A08693
H.sapiens rap1b
cDNA


2e-55


539995


transforming protein rap1b - rat
(strain Copenhagen)


2e-18 


1412 


AF010403


Homo sapiens ALR 
mRNA, complete cds 


2e-55 


2358285 


(AF010403) ALR [Homo
sapiens] 


1e-49


1413 


M77016 


Human tropomodulin 
mRNA, complete cds. 


8e-56 


262249 


(S52010) orf 1 5' of EpoR [mice, 
Peptide, 85 aa] [Mus sp.] 


0.027 


1414 


AB020633 


Homo sapiens mRNA 
for KIAA0826 
protein, partial cds 


2e-56 


<NONE> 


<NONE> 


<NONE> 


1415 


X87489 


H.sapiens genomic 
DNA (chromosome 
3; clone NL1243D)


2e-56 


1814029 


(U84501) cuticle collagen 
[Caenorhabditis briggsae]


0.038


1416 


AB007893 


Homo sapiens 
KIAA0433 mRNA, 
partial cds 


2e-56 


2887437 


(AB007893) KIAA0433 [Homo 
sapiens] 


9e-21


1417 


X78925 


H.sapiens HZF2 
mRNA for zinc finger 
protein 


1e-56


3342002 


(AF054180) hematopoietic cell 
derived zinc finger protein 
[Homo sapiens] 


2e-21 


1418 


Z56281 


H.sapiens mRNA for 
interferon regulatory 
factor 3 


9e-57 


2497442 


INTERFERON 
REGULATORY FACTOR 3 
factor 3 [Homo sapiens] 


2e-21 


1419 


U78772 


Homo sapiens nuclear 
VCP-like protein 
NVLp.1


8e-57 


2406565 


(U68140) nuclear VCP-like 
protein NVLp.2 [Homo sapiens] 


5e-20 


1420 


D79994 


Human mRNA for 
KIAA0172 gene, 
partial cds 


3e-57 


1136404 


(D79994) similar to ankyrin of 
Chromatium vinosum. [Homo 
sapiens] 


9e-38 


1421 


AB002342 


Human mRNA for 
KIAA0344 gene, 
complete cds 


1e-57


2224629 


(AB002342) KIAA0344 [Homo 
sapiens] 


4e-20 


1422 


L19437


Human transaldolase 
mRNA containing 
transposable element, 
complete cds 


1e-57


1553119 


(U63159) transaldolase [Mus 
musculus] 


2e-20 


1423 


D17532 


Human mRNA for 
RCK, complete cds


9e-58 


129376 


PROBABLE ATP- 
DEPENDENT RNA 
HELICASE P54 (ONCOGENE 
RCK) (DEAD BOX PROTEIN 
6) 


1e-10


1424 


X79568 


H.sapiens BDP1
mRNA for protein- 
tyrosine-phosphatase 


9e-58 


1871531 


(X79568) protein-tyrosine-
phosphatase


1e-22


1425 


X79568 


H.sapiens BDP1
mRNA for protein-
tyrosine-phosphatase


9e-58 


1871531 


(X79568) protein-tyrosine- 
phosphatase 


9e-23 


1426 


AB012295


Homo sapiens 
HKE1.5 mRNA for 
GDS-related protein, 
complete cds 


7e-58 


2648021 


(Z97184) RGL2 [Homo sapiens]


9e-19 


1427 


AF086040 


Homo sapiens full 
length insert cDNA 
clone YX52E07 


1e-58


543222 


glutamine (Q)-rich factor 1, 
QRF-1 - mouse factor 1, QRF-1
[mice, B-cell leukemia, BCL1, 
Peptide Partial, 84 aa] 


3e-36 


1428 


AB018195 


Homo sapiens ca xi 
mRNA for carbonic 
anhydrase-related
protein XI, complete
cds 


4e-59 


<NONE> 


<NONE> 


<NONE> 


1429 


AF071777 


Mus musculus IRE1 
(Ire1) mRNA,
complete cds 


4e-59 


3766209 


(AF071777) IRE1 [Mus
musculus] 


7e-28 


1430 


AB000462 


Homo sapiens mRNA 
for SH3 binding
protein, complete cds, 
clone:RES4-23A 


3e-59 


<NONE> 


<NONE> 


<NONE> 


1431 


AF038172 


Homo sapiens clone 
23923 mRNA 
sequence 


3e-59 


3758855 


(Z98551) MAL3P6.11
[Plasmodium falciparum]


1.3 


1432 


Z84812 


Human DNA
sequence from phage 
pTEL from a contig 
from the tip of the 
short arm of 
chromosome 16, 
spanning 2Mb of 
16p13.3 Contains
ESTs 


1e-59


400927 


RIBONUCLEOPROTEIN 
RB97D ribonucleoprotein 
[Drosophila melanogaster]


2.5 


1433 


U364S4 


Human laminin-
binding protein gene,
partial cds, and E2
small nucleolar RNA
gene, complete
sequence


1e-59


226005 


protein 40kD [Mus musculus] 


7e-05 


1434


L11285


Homo sapiens ERK
activator kinase
(MEK2) mRNA.


1e-59


2499630


DUAL SPECIFICITY
MITOGEN-ACTIVATED
PROTEIN KINASE KINASE 2
(MAP KINASE KINASE 2)
(MAPKK 2) kinase type 2
[Gallus gallus]


3e-21 


1435 


AF086555 


Homo sapiens full 
length insert cDNA 
clone ZE14E04 


4e-60 


3287674 


(AC005239) F23149.1 [Homo 
sapiens] 


2e-04 


1436 


M24766 


Human (clone 
pHAIV2-12) alpha-2 
collagen type IV 


4e-60 


29551 


(X05610) alpha (2) chain 
[Homo sapiens] 


6e-15


1437 


X65550


H.sapiens mki67a 
mRNA (long type) 
for antigen of 
monoclonal antibody 
Ki-67 


4e-60 


1170654 


ANTIGEN KI-67 
>gi|539555|pir||A48666 cell 
proliferation antigen Ki-67, long 
form - human Ki-67 [Homo 
sapiens] 


3e-15


1438 


M27319 


Human calmodulin 
mRNA, complete cds.


4e-60 


1345451 


(X05949) Calmodulin (AA 2 - 
59) (449 is 1st base in codon)
[Drosophila melanogaster] 


7e-20 


1439 


Y12781 


Homo sapiens mRNA 
for transducin (beta) 
like 1 protein 


3e-60 


62133 


(X06172) put. 134 kD protein 
(AA 1 - 1187); put. replicase


7.4 


1440 


AB002383 


Human mRNA for 
KIAA0385 gene, 
complete cds 


1e-60


1001548 


(D64000) hypothetical protein 


4.4 


1441 


AF070614 


Homo sapiens clone 
24732 unknown 
mRNA, partial cds


2e-61


3283879 


(AF070614) unknown [Homo 
sapiens] 


3e-17


1442 


AB002326 


Human mRNA for 
KIAA0328 gene, 
partial cds 


6e-62 


547891 


MICROTUBULE- 
ASSOCIATED PROTEIN 4 
microtubule-associated protein- 
U [Bos taurus] 


5.6 


1443 


AF086471 


Homo sapiens full 
length insert cDNA 
clone ZD88A01 


5e-62 


<NONE> 


<NONE> 


<NONE> 


1444 


AB002311


Human mRNA for
KIAA0313 gene,
complete cds


2e-62


2506357


DIHYDROXYPHENYLPROPIONATE
1,2-DIOXYGENASE
>gi|1657544 (U73357) similar
to mcpl gene (catechol 2,3-
dioxygenase) of A. eutrophus 3-
(2,3-
dihydroxyphenylpropionate)1,2-
dioxygenase 2,3-
dihydroxyphenylpropionate 1,2-
dioxygenase


3.4 


1445


AF069737


Xenopus laevis
notchless (nle)
mRNA, complete cds


2e-62


3687833


(AF069737) notchless [Xenopus
laevis]


1e-55


1446


AF044209


Homo sapiens nuclear
receptor co-repressor
N-CoR mRNA,
complete cds


5e-63


2137603


nuclear receptor co-repressor N-
CoR - mouse musculus]
>gi|1583865|prf||2121436A
thyroid hormone receptor co-
repressor [Mus musculus]


2e-47


1447


M69238


Human aryl
hydrocarbon receptor
nuclear translocator
(ARNT) mRNA,
complete cds.


2e-63


2702319


(AF001307) aryl hydrocarbon
receptor nuclear translocator;
Arnt [Homo sapiens]


5e-19


1448


X80497


H.sapiens PHKLA
mRNA


2e-63


1170685


PHOSPHORYLASE B
KINASE ALPHA
REGULATORY CHAIN,
LIVER ISOFORM
(PHOSPHORYLASE KINASE
ALPHA L SUBUNIT)
>gi|663010 (X80497)
phosphorylase kinase
phosphorylase kinase alpha
subunit [Homo sapiens]


5e-22


1449


AF031141


Homo sapiens
ubiquitin conjugating
enzyme


2e-63


2623260


(AF031141) ubiquitin
conjugating enzyme [Homo
sapiens]


1e-23


1450


Z37166


H.sapiens BAT1
mRNA for nuclear
RNA helicase


6e-64


2500529


PROBABLE ATP-
DEPENDENT RNA
HELICASE P47
>gi|2135840|pir||I37201 nuclear
RNA helicase (DEAD family)
BAT1 - human >gi|587146
(Z37166) nuclear RNA helicase
(DEAD family) [Homo sapiens]


9e-24


1451


M64240


Human helix-loop-
helix zipper protein
(max) mRNA,
complete cds. > ::
gb|I41138|I41138
Sequence 1 from
patent US 5624818 >
gb|I77062|I77062
Sequence 1 from
patent US 5693487


5e-64


83175


Myc-binding factor Max, short
form - human


8e-22


1452


M98252


Homo sapiens lysyl
hydroxylase (partial
clone 2.2 Kb LH)
RNA, complete
mature peptide.


2e-64


400205


PROCOLLAGEN-LYSINE,2-
OXOGLUTARATE 5-
DIOXYGENASE
PRECURSOR (LYSYL
HYDROXYLASE) lysyl
hydroxylase [Homo sapiens]


7e-22 


1453


U09550


Human oviductal
glycoprotein mRNA,
complete cds.


8e-65


2493676


OVIDUCT-SPECIFIC
GLYCOPROTEIN
PRECURSOR (OVIDUCTAL
GLYCOPROTEIN)
(OVIDUCTIN)


9e-11


1454


X67877


R.norvegicus mRNA
for cytosolic
resiniferatoxin-
binding protein


7e-65


423664


resiniferatoxin-binding protein
RBP-26, cytosolic - rat
>gi|311660 (X67877) cytosolic
resiniferatoxin binding protein
RBP-26 [Rattus norvegicus]
>gi|1093373|prf||2103310A
resiniferatoxin-binding protein




1455


AB018254


Homo sapiens mRNA 
for KIAA0711
protein, complete cds 


6e-65 


92298 


glutamine/glutamic acid-rich 
protein 


0.9S 


1456


J03607


Human 40-kDa 
keratin intermediate 
filament precursor 
gene. 


3e-65 


1070608 


keratin 19, type I, cytoskeletal - 
human sapiens] 


4e-07 


1457


U65896


Human gamma-
glutamyl carboxylase 
gene, complete cds 


2e-65 


<NONE> 


<NONE> 


<NONE> 


1458


U07681


Human NAD(H)-
specific isocitrate
dehydrogenase alpha
subunit precursor
mRNA, complete cds.


2e-65


1708399


ISOCITRATE
DEHYDROGENASE (NAD),
MITOCHONDRIAL SUBUNIT
ALPHA PRECURSOR
(ISOCITRIC
DEHYDROGENASE) (NAD+-
SPECIFIC ICDH)
dehydrogenase alpha chain
precursor - human >gi|706839
subunit precursor [Homo
sapiens]


4e-26 


1459


U8S080


Human zinc finger
protein (LD5-1) gene,
exons 4, 5 and 6, and
complete cds


2e-65


1373394


(U57796) zinc finger protein
[Homo sapiens] >gi|2306773


2e-39 


1460 


M96625 


Gallus domesticus
tensin mRNA
sequence.


3e-66 


2134419 


tensin - chicken (fragment) 
>gi|co8Q5 (Z18529) tensin
[Gallus gallus] >gi|212755
(L06662) tensin [Gallus gallus] 


1e-51


1461


U13262 


Mus musculus myelin
gene expression 
factor (MEF-2) 
mRNA, partial cds. 


1e-70


536926 


(U13262) myelin gene
expression factor [Mus 
musculus] 


9e-42 


1462


U64033 


Mus musculus Tera 
(Tera) mRNA, 
complete cds 


5e-72


1575505 


(U64033) Tera [Mus musculus]


9e-34 


1463 


X78989 


M. musculus mRNA 
for testin 


6e-74 


1351218 


TESTIN 2 (TES2)
[CONTAINS: TESTIN 1


8e-31


1464


U64033


Mus musculus Tera 
(Tera) mRNA, 
complete cds 


2e-74 


1575505 


(U64033) Tera [Mus musculus]


5e-37 


1465


AF057365 


Canis familiaris UDP 
N-acetylglucosamine 
transporter mRNA, 
complete cds 


9e-79 


3298605 


(AF057365) UDP N-
acetylglucosamine transporter
[Canis familiaris]


9e-10


1466


AJ006064 


Rattus norvegicus 
mRNA for coronin- 
like protein 


1e-82


3757680 


(AJ006064) coronin-like protein 
[Rattus norvegicus]


3e-62 


1467


U91582 


Macaca fascicularis 
UDP- 

glucuronosyltransfera 
se mRNA, complete 
cds 


4e-89 


140396 


KARYOGAMY PROTEIN 
KAR4 yeast (Saccharomyces 
cerevisiae) 


le-OS 


1468 


X06762 


Mouse Hox2.3 
mRNA 


3e-92 


123255


HOMEOBOX PROTEIN HOX- 
B7 (HOX-2C) 


9e-23 


1469


AB016930


Cricetulus griseus
mRNA for
Phosphatidylglycerop
hosphate synthase,
complete cds


5e-94


4159682


(AB016930)
Phosphatidylglycerophosphate
synthase [Cricetulus griseus]


7e-34 


1470


X74504


M.musculus T10
mRNA


7e-97


1711658


ser/thr-rich protein
T10 in dgcr region
>gi|480900|pir||S3748S gene
T10 protein - mouse


3e-59


1471


U13175


Rattus norvegicus
clone ubc10a
ubiquitin conjugating
enzyme (E217kB)
mRNA, complete cds.


3e-98


1351345


UBIQUITIN-CONJUGATING
ENZYME E2-17 KD 3
(UBIQUITIN-PROTEIN
LIGASE) (UBIQUITIN
CARRIER PROTEIN)
(E2(17)KB 3)
>gi|1085588|pir||S53358
ubiquitin conjugating enzyme
(E217kB) - rat >gi|595666
(U13175) ubiquitin conjugating
enzyme [Rattus norvegicus]
norvegicus] >gi|1145691
(U39318) UbcH5C [Homo
sapiens]


5e-05 


1472 


S79873 


h-lamp-2=lysosome-
associated membrane
protein-2 protein-2b
(LAMP2) mRNA,
alternatively spliced
form h-lamp-2b,
complete cds. 


e-119 


<NONE> 


<NONE> 


<NONE> 


1473 


D13623


Rat mRNA for p34 
protein, complete cds 


e-112 


480379 


ribosome-binding protein p34 - 
rat sp.] 


2e-05 


1474 


AB013357


Mus musculus mRNA 
for 49 kDa zinc finger 
protein, complete cds 


e-136


4153886 


(AB013357) 49 kDa zinc finger 
protein 


5e-08 


1475 


AB016930 


Cricetulus griseus 
mRNA for 

Phosphatidylglycerop 
hosphate synthase, 
complete cds 


e-117 


4159682


(AB016930) 

Phosphatidylglycerophosphate 
synthase [Cricetulus griseus] 


4e-32


1476 


U38253


Rattus norvegicus
initiation factor eIF-
2B gamma subunit
(eIF-2B gamma)
mRNA, complete cds


e-103


2494312


TRANSLATION INITIATION
FACTOR EIF-2B GAMMA
SUBUNIT (EIF-2B GDP-GTP
EXCHANGE FACTOR)
subunit [Rattus norvegicus]


3e-42 


1477 


X73683 


R.norvegicus mRNA 
for histone H3.3 


e-117 


122075 


-(II3JQ)liiju?ncIIj.j fiuiiflj 
(Drosophila melanogaster) 
histone H3.3B - chicken 
>gi|2119023|pir||S61218 histone
H3.3 - fruit fly (Drosophila
hydei) 1-136) [Oryctolagus
cuniculus] >gi|8046 (X53822)
Histone H3.3Q gene product
[Drosophila melanogaster]
>gi|5U98gaIIus} >gi|161190
(M17876) histone H3 [Spisula
solidissima] >gi|211853
(M11393) histone 3.3 [Gallus
gallus] >gi|306848 (M11354)
H3.3 histone [Homo sapiens]
melanogaster] >gi|963031
(X81205) histone H3.3 H3.3A
variant [Drosophila 
melanogaster] musculus] 


1e-45


1478 


U32498 


Rattus norvegicus 
rsec8 mRNA, partial 
cds 


e-108 


2143962 


rsec8 - rat (fragment) 
>gi|1019441 (U32498) rsec8
[Rattus norvegicus]


7e-48 


1479 


U41736 


Mus musculus ancient 
ubiquitous 46 kDa 
protein AUP1 
precursor (Aup1)
mRNA, complete cds 


e-146 


1517822 


(U41736) ancient ubiquitous 46 
kDa protein AUP46 precursor
[Mus musculus]


5e-49 


1480 


AF041338


Bos taurus vacuolar 
proton pump subunit 
SFD alpha isoform 
(SFD) mRNA, 
complete cds 


e-119


2895578 


(AF041338) vacuolar proton
pump subunit SFD alpha
isoform [Bos taurus]


3e-49 


1481 


AF064553


Mus musculus NSD1
protein mRNA,
complete cds


e-121


3329465


(AF064553) NSD1 protein
[Mus musculus]


2e-50 


1482 


AB000517


Rattus sp. mRNA for
CDP-diacylglycerol
synthase, complete
cds


e-146


1517822


(U41736) ancient ubiquitous 46
kDa protein AUP46 precursor
[Mus musculus]


2e-51 


1483 


D38517


Mouse mRNA for
Dhm1 protein,
complete cds


e-118


2137562


mouse Dhm1 protein - mouse
musculus]


6e-54


1484


X54352


M.domesticus MD6
mRNA


e-139


1085499


CDC4 repeat unit-containing
protein - mouse


1e-55


1485


U57692


Mus musculus N-
terminal asparagine
amidohydrolase
(Ntan1) mRNA,
complete cds


e-118


2498797


PROTEIN N-TERMINAL
ASPARAGINE
AMIDOHYDROLASE
(PROTEIN NH2-TERMINAL
ASPARAGINE DEAMIDASE)
(NTN-AMIDASE) (PNAD)
(PROTEIN NH2-TERMINAL
ASPARAGINE
AMIDOHYDROLASE)
(PNAA) >gi|1373365 (U57691)
N-terminal asparagine
amidohydrolase [Mus musculus]
amidohydrolase [Mus musculus]


1486


X80169


M. musculus mRNA
for 200 kD protein


e-119


1717793


PROTEIN TSG24 (MEIOTIC
CHECK POINT
REGULATOR)
>gi|1083553|pir||A55117 tsg24


9e-58


1487


U57692


Mus musculus N-
terminal asparagine
amidohydrolase
(Ntan1) mRNA,
complete cds


e-120


2498797


PROTEIN N-TERMINAL
ASPARAGINE
AMIDOHYDROLASE
(PROTEIN NH2-TERMINAL
ASPARAGINE DEAMIDASE)
(NTN-AMIDASE) (PNAD)
(PROTEIN NH2-TERMINAL
ASPARAGINE
AMIDOHYDROLASE)
(PNAA) >gi|1373365 (U57691)
N-terminal asparagine
amidohydrolase [Mus musculus]
amidohydrolase [Mus musculus]


8e-58


1488


U08215


Mus musculus Hsp70
related NST-1 (hsr.1)
mRNA, complete cds


e-109


473407


(U08215) NST-1 [Mus
musculus]


7e-58


1489


D85926


Mouse mRNA for
Ray,
complete cds


e-110


1944389


(D85926) Ray [Mus musculus]


2e-58


1490


L20427


Rattus norvegicus
dihydroxypolyprenylb
enzoate
methyltransferase
mRNA, complete cds


e-123


457372


(L20427)
dihydroxypolyprenylbenzoate
methyltransferase
dihydroxypolyprenylbenzoate
methyltransferase [Rattus
norvegicus]


4e-59


1491


X56044


M.musculus mRNA
for protein Htf9C


e-121


3183977


(X56044) protein Htf9C [Mus
musculus]


1e-60


1492


S74774


p59fyn(T)=OKT3-
induced calcium
influx regulator


e-163


729896


PROTO-ONCOGENE
TYROSINE-PROTEIN
KINASE FYN (P59-FYN)
>gi|420217|pir||A44991 protein-
tyrosine kinase (EC 2.7.1.112)
fyn - mouse


8e-63 


1493 


U88873 


Mus musculus BUB2
like protein 1 
(HBLP1) mRNA, 
complete cds 


e-123 


4099611 


(U88873) BUB2-like protein 1
[Mus musculus] 


1e-63


1494 


U48852 


Cricetulus griseus HT 
protein mRNA, 
complete cds. 


e-117 


1216486 


(U48852) HT protein
[Cricetulus griseus]


7e-64 


1495 


AF032667 


Rattus norvegicus 
rexo70 mRNA, 
complete cds 


e-142 


2827160 


(AF032667) rexo70 [Rattus 
norvegicus] 


5e-66 


1496 


M62722 


Chinese hamster 
phosphatidylserine
decarboxylase 
mRNA, 3' end. 


e-114 


118910 


PHOSPHATIDYLSERINE
DECARBOXYLASE
PROENZYME
>gi|109423|pir||A38732
phosphatidylserine
decarboxylase (EC 4.1.1.65) -
Chinese hamster (fragment)


2e-67 


1497 


AF072758 


Mus musculus fatty 
acid transport protein 
3 mRNA, partial cds 


e-130 


3335567 


(AF072758) fatty acid transport 
protein 3; FATP3 [Mus 
musculus] 


1e-67


1498 


AB005549 


Rattus norvegicus 
mRNA for atypical 
PKC specific binding 
protein, complete cds 


e-113


3868778 


(AB005549) atypical PKC 
specific binding protein [Rattus 
norvegicus] 


2e-69 


1499 


U57344


Mus musculus 
homeobox protein 
Meis3 mRNA,
complete cds 


e-143 


3024124 


HOMEOBOX PROTEIN
MEIS3 


6e-72 


1500 


U09874


Mus musculus SKD3
mRNA, complete cds.


e-142


2493735


SKD3 PROTEIN SKD3 [Mus
musculus]


1e-72


1501 


U72194


Mus musculus
muskelin mRNA,
complete cds


e-148


3493462


(U72194) muskelin [Mus
musculus]


2e-74 


1502 


X80169


M. musculus mRNA
for 200 kD protein


e-155


1717793


PROTEIN TSG24 (MEIOTIC
CHECK POINT
REGULATOR)
>gi|1083553|pir||A55117 tsg24


3e-77


1503 


U72194 


Mus musculus 
muskelin mRNA,
complete cds 


e-154 


3493462 


(U72194) muskelin [Mus 
musculus] 


2e-78 


1504 


Y12836 


Cricetulus griseus
mRNA for Zn finger 
factor 


e-146 


3150148 


(Y12836) Zn finger factor
[Cricetulus griseus]


3e-83 
Table 5


SEQ ID


Start


Stop


Score


Direction


Description


29






JO/Z


For


mkk like kinases 


30 


31 


182 


3943 


For 


Basic region plus leucine zipper 
transcription factors 


31 


298 


397 


5625 


For 


mkk like kinases 


186 


175 


395 


7660 


For 


SH2 Domain 


187 


358 


432 


4320 


For 


Ank repeat 


196 


37 


322 


6049 


For 


mkk like kinases 


234 


23 


121 


4607 


For 


SH3 Domain 


308 


110 


172 


4150 


For 


Zinc finger, C2H2 type 


410 


42 


191 


4036 


For 


Basic region plus leucine zipper 
transcription factors 


431 


71 


428 


5538 


Rev 


ATPases Associated with Various 
Cellular Activities 


552 


116 


288 


3930 


Rev 


Basic region plus leucine zipper 
transcription factors 


639 


157 


561 


5797 


For 


ATPases Associated with Various 
Cellular Activities 


746 


209 


427 


5379 


For 


Fibronectin type III domain 


768 


116 


288 


3930 


For 


Basic region plus leucine zipper 
transcription factors 


807 


339 


392 


3620 


For 


Zinc finger, C2H2 type 


820 


341 


406 


2930 


Rev 


EF-hand 


822 


108 


262 


4179 


For 


Basic region plus leucine zipper 
transcription factors 


836 


158 


353 


4430 


For 


Basic region plus leucine zipper 
transcription factors 


1157 


41 


444 


5279 


Rev 


protein kinase 


1192 


186 


416 


5469 


For 


Fibronectin type III domain 


1268 


238 


315 


3540 


For 


Ank repeat 


1269 


79 


240 


11640 


For 


LIM domain containing proteins 


1288 


73 


234 


3953 


For 


Basic region plus leucine zipper 
transcription factors 



1309 


248 


404 


8226 


for 


LIM domain containing proteins 


1324


2V4 


356 


/iron 

4oyu 


for


Zinc finger, C2H2 type


1325


1 


234 


oVol 


for


C2 domain (prot. kinase C like)


1336 


66 


164 


63yu 


for


WD domain, G-beta repeats


1360


222 


377 


8686 


for


LIM domain containing proteins 


1365 


69 


257


5221 


for


Basic region plus leucine zipper 
transcription factors 


1380


42 


140


7130 


for


WD domain, G-beta repeats


1386 


243 


398 


8736 


for 


LIM domain containing proteins 


1410 


222 


350 


10553 


for 


Trypsin 


1417 


8 


354 


6073 


for 


Protein Tyrosine Phosphatase 


1454 


49 


209 


3996 


for 


Basic region plus leucine zipper 
transcription factors 


1464


4 


180 


4978 


for


RNA recognition motif. (aka RRM,
RBD, or RNP domain)


1478


54


437 


5176 


for


protein kinase 


1496 


241 


520


3929


for


Helicases conserved C-terminal domain 


1496 


40 


612 


5187 


for 


protein kinase 


1503 


154 


216 


4870 


for 


Zinc finger, C2H2 type 


1514 


2 


252 


4662 


for 


RNA recognition motif, (aka RRM, 
RBD, or RNP domain) 


1527 


156 


212 


3520 


for 


Zinc finger, C2H2 type 


1538 


9 


635 


11087 


for 


wnt family of developmental signaling 
proteins 


1540 


289 


471 


4107 


for 


Basic region plus leucine zipper 
transcription factors 


1549 


200 


391 


4118 


for 


Basic region plus leucine zipper 
transcription factors 


1556 


163 


354 


3958 


for 


Basic region plus leucine zipper 
transcription factors 


1557 


207 


398 


4038 


for 


Basic region plus leucine zipper 
transcription factors 


1563 


107 


298 


3978 


for 


Basic region plus leucine zipper 
transcription factors 



1622 


180 


365 


4022 


for 


Basic region plus leucine zipper 
transcription factors 


1630 


100 


291 


3998 


for 


Basic region plus leucine zipper 
transcription factors 


1674 


196 


258 


4880 


for 


Zinc finger, C2H2 type 


1676 


9 


86 


6610 


for 


Homeobox Domain 


1677 


316 


369 


5780 


rev 


Thioredoxins 


1688 


109 


410 


17414 


for 


Ras family 


1704 


184 


372 


3977 


for 


Basic region plus leucine zipper 
transcription factors 


1707 


92 


439 


24100 


rev 


Phosphatidylinositol-specific 

phospholipase C, [illegible] domain


1711


AO J


Jul


£40.0.


for


WD domain, G-beta repeats


1 7 Ad


ZJo




1 0S79


rev


Serine carboxypeptidases


1755 


281 


367 


2580 


for 


EF-hand 


1762 


236 


334 


5880 


for 


WD domain, G-beta repeats 


1779 


64 


126 


4790 


for 


Zinc finger, C2H2 type 


1801 


295 


351 


4030 


for 


Zinc finger, C2H2 type 


1804 


301 


378 


3460 


for 


Ank repeat 


1808 


36 


161 


4170 


for 


Basic region plus leucine zipper 
transcription factors 


1811 


184 


315 


8390 


for 


N-terminal homology in Ets domain 


1814 


127 


294 


10770 


for 


Bromodomain (conserved sequence 
found in human, Drosophila and yeast 

proteins.)


1818


Q


146


4741




Double-stranded RNA binding motif


1819


978


jjj




for




1820 


123 


299 


12150 


for 


Homeobox Domain


1821 


127 


303 


12180 


for 


Homeobox Domain 


1830 


184 


267 


4270 


for 


Ank repeat 


1832 


18 


173 


8987 


for 


SH3 Domain 


1835 


51 


206 


8987 


for 


SH3 Domain 


1839 


224 


307 


4270 


for 


Ank repeat 


1846 


12 


398 


36700 


for 


G-protein alpha subunit 


1909 


160 


258 


6370 


for 


WD domain, G-beta repeats 


1911 


35 


151 


9335 


for 


Zinc finger, C3HC4 type (RING finger) 


1980 


60 


197 


7917 


for 


Zinc finger, C3HC4 type (RING finger) 


2065 


253 


306 


5410 


for 


Zinc finger, CCHC class 


2135 


2 


401 


10596 


for 


ATPases Associated with Various 
Cellular Activities 


2216 


90 


179 


5380 


for 


WW/rsp5/WWP domain containing 
proteins 


2218 


127 


225 


5500 


for 


WD domain, G-beta repeats 


2281 


20 


387 


6044 


for 


Protein Tyrosine Phosphatase 


2282 


183 


353 


5136 


for 


C2 domain (prot. kinase C like) 


2286 


12 


382 


5228 


for 


protein kinase 


2310 


20 


371 


5962 


for 


Protein Tyrosine Phosphatase 


2363 


48 


211 


4132 


for 


Basic region plus leucine zipper 
transcription factors 


2424 


43 


194 


3996 


for 


Basic region plus leucine zipper 
transcription factors 


2428 


25 


350 


4675 


for 


Dual specificity phosphatase, catalytic 
domain 


2562 


18 


101 


4560 


for 


Ank repeat 


2577 


0 


311 


10295 


for 


4 transmembrane segments integral 
membrane proteins 


2591 


60 


165 


4560 


for 


SH2 Domain 


2684 


9 


461 


5759 


for 


ATPases Associated with Various 
Cellular Activities 


2826 


116 


400 


16107 


for


DEAD and DEAH box helicases 


2859 


100 


320 


5550 


rev 


ATPases Associated with Various 
Cellular Activities 


2871 


198 


392 


9384 


for 


DEAD and DEAH box helicases 


2944 


18 


281 


10480 


for 


Calpain large subunit, domain III 


2969 


5 


387 


5976 


rev 


protein kinase 


3015 


131 


214 


3600 


for 


Ank repeat 


3047 


191 


292 


5295 


for 


WD domain, G-beta repeats 


3081 


190 


252 


4360 


for 


Zinc finger, C2H2 type 


3108 


275 


367 


5791 


for 


WD domain, G-beta repeats 


3147 


190 


369 


4022 


for 


Basic region plus leucine zipper 
transcription factors 


3152 


129 


320 


3947 


for 


Basic region plus leucine zipper 
transcription factors 


3158 


167 


334 


4180 


for 


Basic region plus leucine zipper 
transcription factors 


3175 


14 


164 


5951 


for 


mkk like kinases 



3175 


8 


112 


5968 


for 


protein kinase 


3178 


45 


386 


19398 


for 


ATPases Associated with Various 
Cellular Activities 


3183 


14 


215 


9133 


for 


4 transmembrane segments integral 
membrane proteins 


3190 


229 


390 


6089 


for 


mkk like kinases 


3190 


118 


390 


8063 


for 


protein kinase 


3193 


293 


355 


3570 


for 


Zinc finger, C2H2 type 


3195 


0 


215 


10146 


for 


4 transmembrane segments integral 
membrane proteins 


3197 


281 


343 


4490 


for 


Zinc finger, C2H2 type 


3208 


34 


256 


4190 


for 


Basic region plus leucine zipper 
transcription factors 


3258 


138 


394 


9877 


for 


Ras family 


3266 


8 


139 


9328 


for 


ATPases Associated with Various 
Cellular Activities 


3267 


97 


180 


3820 


for 


Ank repeat 


3274 


11 


187 


15442 


for 


Fork head domain, eukaryotic 
transcription factors 


3281 


15 


182 


9681 


for 


mkk like kinases 


3285 


16 


102 


4680 


for 


EF-hand 


3292 


208 


300 


5585 


for 


WD domain, G-beta repeats 


3297 


7 


153 


6100 


for 


Helicases conserved C-terminal domain 


3306 


161 


223 


4900 


for 


Zinc finger, C2H2 type 


3307 


43 


321 


8740 


for 


SH2 Domain 


3339 


94 


342 


14970 


for 


SH2 Domain 


3345 


65 


271 


12512 


for 


PDZ domain 


3351 


124 


270 


6068 


for 


Phorbol esters/diacylglycerol binding 

EXAMPLE 4

Differential Expression of Polynucleotides of the Invention:
Description of Libraries and Detection of Differential Expression

The relative expression levels of the polynucleotides of the invention
were assessed in several libraries prepared from various sources, including cell lines and
patient tissue samples. Table 6 provides a summary of these libraries, including the
shortened library name (used hereafter), the mRNA source used to prepare the cDNA
library, the abbreviated name of the library that is used in the tables below (in quotes),
and the approximate number of clones in the library.



Table 6

Description of cDNA Libraries

Library (lib #) | Description | Number of Clones in this Clustering
1 | Km12L4: Human Colon Cell Line, High Metastatic Potential (derived from Km12C); "High Colon" | 307133
2 | Km12C: Human Colon Cell Line, Low Metastatic Potential; "Low Colon" | 284755
3 | MDA-MB-231: Human Breast Cancer Cell Line, High Metastatic Potential; micro-metastases in lung; "High Breast" | 326937
4 | MCF7: Human Breast Cancer Cell, Non Metastatic; "Low Breast" | 318979
8 | MV-522: Human Lung Cancer Cell Line, High Metastatic Potential; "High Lung" | 223620
9 | UCP-3: Human Lung Cancer Cell Line, Low Metastatic Potential; "Low Lung" | 312503
12 | Human microvascular endothelial cells (HMEC) - Untreated; PCR (OligodT) cDNA library | 41938
13 | Human microvascular endothelial cells (HMEC) - Basic fibroblast growth factor (bFGF) treated; PCR (OligodT) cDNA library | 42100
14 | Human microvascular endothelial cells (HMEC) - Vascular endothelial growth factor (VEGF) treated; PCR (OligodT) cDNA library | 42825
15 | Normal Colon - UC#2 Patient; PCR (OligodT) cDNA library; "Normal Colon Tumor Tissue" | 34285
16 | Colon Tumor - UC#2 Patient; PCR (OligodT) cDNA library; "Normal Colon Tumor Tissue" | 35625
17 | Liver Metastasis from Colon Tumor of UC#2 Patient; PCR (OligodT) cDNA library; "High Colon Metastasis Tissue" | 36984
18 | Normal Colon - UC#3 Patient; PCR (OligodT) cDNA library; "Normal Colon Tumor Tissue" | 36216
19 | Colon Tumor - UC#3 Patient; PCR (OligodT) cDNA library; "High Colon Tumor Tissue" | 41388
20 | Liver Metastasis from Colon Tumor of UC#3 Patient; PCR (OligodT) cDNA library; "High Colon Metastasis Tissue" | 30956
21 | GRRpz: Human Prostate Cell Line | 164801
22 | WOca: Human Prostate Cancer Cell Line | 162088



The KM12L4 and KM12C cell lines are described in Example 1 above.
The MDA-MB-231 cell line was originally isolated from pleural effusions (Cailleau, J.
Natl. Cancer Inst. (1974) 53:661), is of high metastatic potential, and forms poorly
differentiated adenocarcinoma grade II in nude mice consistent with breast carcinoma.
The MCF7 cell line was derived from a pleural effusion of a breast adenocarcinoma and
is non-metastatic. The MV-522 cell line is derived from a human lung carcinoma and is
of high metastatic potential. The UCP-3 cell line is a low metastatic human lung
carcinoma cell line; MV-522 is a high metastatic variant of UCP-3. These cell lines
are well-recognized in the art as models for the study of human breast and lung cancer
(see, e.g., Chandrasekaran et al., Cancer Res. (1979) 39:870 (MDA-MB-231 and
MCF-7); Gastpar et al., J Med Chem (1998) 41:4965 (MDA-MB-231 and MCF-7); Ranson et
al., Br J Cancer (1998) 77:1586 (MDA-MB-231 and MCF-7); Kuang et al., Nucleic
Acids Res (1998) 26:1116 (MDA-MB-231 and MCF-7); Varki et al., Int J Cancer
(1987) 40:46 (UCP-3); Varki et al., Tumour Biol (1990) 11:327 (MV-522 and UCP-3);
Varki et al., Anticancer Res. (1990) 10:637 (MV-522); Kelner et al., Anticancer Res
(1995) 15:867 (MV-522); and Zhang et al., Anticancer Drugs (1997) 8:696 (MV-522)).
The samples of libraries 15-20 are derived from two different patients (UC#2 and
UC#3). The bFGF-treated HMEC were prepared by incubation with bFGF at 10 ng/ml
for 2 hrs; the VEGF-treated HMEC were prepared by incubation with 20 ng/ml VEGF
for 2 hrs. Following incubation with the respective growth factor, the cells were
washed and lysis buffer added for RNA preparation. The GRRpz cell line refers to low
passage (3 passages or fewer) human prostate cells, and the WOca cell line refers to low
passage (3 passages or fewer) human prostate cancer cells.

Each of the libraries is composed of a collection of cDNA clones that in
turn are representative of the mRNAs expressed in the indicated mRNA source. In
order to facilitate the analysis of the millions of sequences in each library, the sequences
were assigned to clusters. The concept of "cluster of clones" is derived from a
sorting/grouping of cDNA clones based on their hybridization pattern to a panel of
roughly 300 7-bp oligonucleotide probes (see Drmanac et al., Genomics (1996)
37(1):29). Random cDNA clones from a tissue library are hybridized at moderate
stringency to 300 7-bp oligonucleotides. Each oligonucleotide has some measure of
specific hybridization to that specific clone. The combination of 300 of these measures
of hybridization for 300 probes equals the "hybridization signature" for a specific clone.
Clones with similar sequence will have similar hybridization signatures. By developing
a sorting/grouping algorithm to analyze these signatures, groups of clones in a library
can be identified and brought together computationally. These groups of clones are
termed "clusters". Depending on the stringency of the selection in the algorithm
(similar to the stringency of hybridization in a classic cDNA library screening protocol),
the "purity" of each cluster can be controlled. For example, artifacts of clustering may
occur in computational clustering just as artifacts can occur in "wet-lab" screening of a
cDNA library with 400 bp cDNA fragments, at even the highest stringency. The
stringency used in the implementation of clustering herein provides groups of clones that
are in general from the same cDNA or closely related cDNAs. Closely related clones
can be a result of different length clones of the same cDNA, closely related clones from
highly related gene families, or splice variants of the same cDNA.
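
By way of illustration only, the grouping of clones by hybridization signature can be sketched as follows. The description above does not specify the sorting/grouping algorithm, so the similarity measure (Pearson correlation), the greedy "founder clone" grouping, the stringency value of 0.9, the clone names and the six-probe toy signatures (in place of roughly 300 probes) are all assumptions made for this sketch and are not the method actually used.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length hybridization signatures."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat signature carries no pattern to compare
    return cov / math.sqrt(var_a * var_b)


def cluster_clones(signatures, stringency=0.9):
    """Greedy grouping (hypothetical): a clone joins the first cluster whose
    founding clone has a signature correlating with its own at or above
    `stringency`; otherwise the clone founds a new cluster."""
    clusters = []  # each cluster is a list of clone names; index 0 is the founder
    for name, signature in signatures.items():
        for members in clusters:
            if pearson(signatures[members[0]], signature) >= stringency:
                members.append(name)
                break
        else:
            clusters.append([name])
    return clusters


# Toy data: 4 clones scored against 6 probes (illustrative values only).
signatures = {
    "clone_a": [0.9, 0.1, 0.8, 0.2, 0.7, 0.1],
    "clone_b": [0.8, 0.2, 0.9, 0.1, 0.6, 0.2],
    "clone_c": [0.1, 0.9, 0.2, 0.8, 0.1, 0.9],
    "clone_d": [0.2, 0.8, 0.1, 0.9, 0.2, 0.7],
}
clusters = cluster_clones(signatures)
strict_clusters = cluster_clones(signatures, stringency=0.99)
```

Raising the stringency parameter in this sketch yields smaller, "purer" clusters, in the same way that the stringency of selection is described above as controlling cluster purity.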

Differential expression for a selected cluster was assessed by first
determining the number of cDNA clones corresponding to the selected cluster in the
first library (Clones in 1st), and then determining the number of cDNA clones
corresponding to the selected cluster in the second library (Clones in 2nd). Differential
expression of the selected cluster in the first library relative to the second library is
expressed as a "ratio" of percent expression between the two libraries. In general, the
"ratio" is calculated by: 1) calculating the percent expression of the selected cluster in
the first library by dividing the number of clones corresponding to a selected cluster in
the first library by the total number of clones analyzed from the first library;
2) calculating the percent expression of the selected cluster in the second library by
dividing the number of clones corresponding to a selected cluster in a second library by
the total number of clones analyzed from the second library; 3) dividing the calculated
percent expression from the first library by the calculated percent expression from the
second library. If the "number of clones" corresponding to a selected cluster in a library
is zero, the value is set at 1 to aid in calculation. The formula used in calculating the
ratio takes into account the "depth" of each of the libraries being compared, i.e., the
total number of clones analyzed in each library.
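
The three-step ratio calculation described above can be sketched as follows. The function and variable names are hypothetical; the library totals are the clone counts listed for libraries 3 and 4 in Table 6, and the clone counts 64 and 0 are those reported for SEQ ID NO: 472 in Table 7. Rounding the result to a whole number for display is an assumption of this sketch.

```python
def expression_ratio(clones_1st, clones_2nd, total_1st, total_2nd):
    """Ratio of percent expression of one cluster between two libraries.

    A clone count of zero is set to 1 to aid in calculation, as described
    in the text. The result is the same whether expression is held as a
    fraction or as a percentage, because the factor of 100 cancels.
    """
    n1 = clones_1st if clones_1st > 0 else 1
    n2 = clones_2nd if clones_2nd > 0 else 1
    expression_1st = n1 / total_1st   # step 1: expression in the first library
    expression_2nd = n2 / total_2nd   # step 2: expression in the second library
    return expression_1st / expression_2nd  # step 3: the "ratio"


LIB3_TOTAL = 326937  # MDA-MB-231 ("High Breast"), Table 6
LIB4_TOTAL = 318979  # MCF7 ("Low Breast"), Table 6

# SEQ ID NO: 472 in Table 7: 64 clones in lib3, 0 clones in lib4.
ratio_472 = expression_ratio(64, 0, LIB3_TOTAL, LIB4_TOTAL)
# SEQ ID NO: 1875 in Table 7: 12 clones in lib3, 3 clones in lib4.
ratio_1875 = expression_ratio(12, 3, LIB3_TOTAL, LIB4_TOTAL)
```

Because the two libraries differ in depth, 64 clones against a zero count set to 1 gives a ratio of about 62 rather than 64.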

In general, a polynucleotide is said to be significantly differentially
expressed between two samples when the ratio value is greater than at least about 2,
preferably greater than at least about 3, more preferably greater than at least about 5,
where the ratio value is calculated using the method described above. The significance
of differential expression is determined using a z score test (Zar, Biostatistical Analysis,
Prentice Hall, Inc., USA, "Differences between Proportions," pp 296-298 (1974)).
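
For illustration, a z score for the difference between two proportions may be computed as sketched below. The text identifies the test only by reference, so the pooled-variance form without continuity correction, the two-sided p-value and the function names used here are assumptions and may differ in detail from the cited procedure.

```python
import math


def two_proportion_z(clones_1st, total_1st, clones_2nd, total_2nd):
    """z score for the difference between two library proportions,
    using the pooled proportion to estimate the standard error."""
    p1 = clones_1st / total_1st
    p2 = clones_2nd / total_2nd
    pooled = (clones_1st + clones_2nd) / (total_1st + total_2nd)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_1st + 1 / total_2nd))
    if std_err == 0:
        return 0.0  # no clones in either library: no evidence of a difference
    return (p1 - p2) / std_err


def two_sided_p(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# SEQ ID NO: 472 (Table 7): 64 clones of 326937 in lib3, 0 of 318979 in lib4.
z_472 = two_proportion_z(64, 326937, 0, 318979)
p_472 = two_sided_p(z_472)
```

In this sketch the raw clone counts are used for the z score; the substitution of 1 for a zero count described above is applied only when forming the ratio.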

EXAMPLE 5

Polynucleotides Differentially Expressed in High Metastatic Potential
Breast Cancer Cells Versus Low Metastatic Breast Cancer Cells

A number of polynucleotide sequences have been identified that are
differentially expressed between cells derived from high metastatic potential breast
cancer tissue and low metastatic breast cancer cells. Expression of these sequences in
breast cancer can be valuable in determining diagnostic, prognostic and/or treatment
information. For example, sequences that are highly expressed in the high metastatic
potential cells can be indicative of increased expression of genes or regulatory
sequences involved in the metastatic process. A patient sample displaying an increased
level of one or more of these polynucleotides may thus warrant more aggressive
treatment. In another example, sequences that display higher expression in the low
metastatic potential cells can be associated with genes or regulatory sequences that
inhibit metastasis, and thus the expression of these polynucleotides in a sample may
warrant a more positive prognosis than the gross pathology would suggest.

The differential expression of these polynucleotides can be used as a
diagnostic marker, a prognostic marker, for risk assessment, patient treatment and the
like. These polynucleotide sequences can also be used in combination with other
known molecular and/or biochemical markers.

The following tables summarize polynucleotides that are differentially
expressed between high metastatic potential breast cancer cells and low metastatic
potential breast cancer cells.

Table 7

Differentially expressed polynucleotides: Higher expression in
high metastatic potential breast cancer (lib3) relative to low metastatic
breast cancer cells (lib4)

SEQ ID NOs: | Lib3 clones | Lib4 clones | lib3/lib4
472 | 64 | 0 | 62
1851 | 6 | 0 | 6
1856 | 8 | 0 | 8
1867 | 6 | 0 | 6
1872 | 6 | 0 | 6
1875 | 12 | 3 | 4
1923 | 89 | 22 | 4
2118 | 7 | 0 | 7
2119 | 7 | 0 | 7
2135 | 37 | 13 | 3
2190 | 19 | 0 | 19
2193 | 16 | 5 | 3
2232 | 12 | 2 | 6
2239 | 6 | 0 | 6
2338 | 21 | 2 | 10
2378 | 16 | 4 | 4
2394 | 6 | 0 | 6
2395 | 6 | 0 | 6
2490 | 13 | 3 | 4
2505 | 16 | 2 | 8
2540 | 8 | 1 | 8
2542 | 11 | 1 | 11
2607 | 11 | 2 | 5
2640 | 22 | 5 | 4
2674 | 8 | 0 | 8
2679 | 19 | 0 | 19
2684 | 14 | 4 | 3
2707 | 8 | 0 | 8
2724 | 9 | 0 | 9
2757 | 6 | 0 | 6
2776 | 10 | 0 | 10
2804 | 13 | 2 | 6
2818 | 6 | 0 | 6
2906 | 14 | 0 | 14
2959 | 26 | 8 | 3
2964 | 17 | 4 | 4
2968 | 6 | 0 | 6
2977 | 22 | 3 | 7
2980 | 13 | 1 | 13
3010 | 6 | 0 | 6
3043 | 10 | 1 | 10
3071 | 33 | 12 | 3
3072 | 9 | 1 | 9
3095 | 19 | 3 | 6
3097 | 11 | 2 | 5
3173 | 12 | 2 | 6
3203 | 8 | 1 | 8
3210 | 27 | 8 | 3
3212 | 13 | 1 | 13
3284 | 8 | 0 | 8
3288 | 6 | 0 | 6
3331 | 14 | 3 | 5
3335 | 13 | 1 | 13



Table 8

Differentially expressed polynucleotides: Higher expression in
low metastatic breast cancer cells (lib4) relative to high metastatic
potential breast cancer (lib3)

SEQ ID NOs: | Lib 3 Clones | Lib 4 Clones | lib4/lib3
402 | 0 | 6 | 6
614 | 3 | 21 | 7
624 | 0 | 6 | 6
626 | 0 | 8 | 8
712 | 0 | 9 | 9
744 | 0 | 7 | 7
[illegible] | [illegible] | [illegible] | [illegible]
1452 | 2 | 13 | 7
1880 | 0 | 9 | 9
1915 | 0 | 7 | 7
1951 | 0 | 6 | 6
1955 | 8 | 32 | 4
2015 | 0 | 7 | 7
2046 | 0 | 7 | 7
2076 | 1 | 22 | 23
2087 | 0 | 6 | 6
2124 | 0 | 9 | 9
2145 | 0 | 8 | 8
2162 | 0 | 6 | 6
2163 | 0 | 12 | 12
2164 | 5 | 19 | 4
2172 | 2 | 15 | 8
2192 | 5 | 16 | 3
2244 | 20 | 43 | 2
2266 | 3 | 18 | 6
2313 | 24 | 56 | 2
2346 | 1 | 13 | 13
2355 | 0 | 10 | 10
2371 | 0 | 6 | 6
2393 | 1 | 17 | 17
2404 | 1 | 21 | 22
2443 | 0 | 6 | 6
2460 | 0 | 11 | 11
2523 | 0 | 6 | 6
2575 | 1 | 10 | 10
2578 | 0 | 6 | 6
2584 | 1 | 17 | 17
2590 | 0 | 6 | 6
2609 | 1 | 9 | 9
2632 | 5 | 24 | 5
2714 | 5 | 24 | 5
2728 | 0 | 6 | 6
2752 | 1 | 14 | 14
2794 | 4 | 15 | 4
2826 | 0 | 7 | 7
2987 | 5 | 15 | 3
3005 | 1 | 14 | 14
3009 | 20 | 58 | 3
3047 | 4 | 17 | 4
3057 | 2 | 17 | 9
3075 | 2 | 11 | 6
3076 | 0 | 6 | 6
3102 | 0 | 6 | 6
3128 | 15 | 52 | 4
3132 | 15 | 52 | 4
3142 | 0 | 6 | 6
3187 | 22 | 49 | 2
3253 | 23 | 96 | 4
3282 | 19 | 46 | 2
3285 | 20 | 40 | 2
3346 | 0 | 9 | 9

EXAMPLE 6

Polynucleotides Differentially Expressed in High Metastatic Potential Lung
Cancer Cells Versus Low Metastatic Lung Cancer Cells

A number of polynucleotide sequences have been identified that are
differentially expressed between cells derived from high metastatic potential lung
cancer cells and low metastatic lung cancer cells. Expression of these sequences in lung
cancer tissue can be valuable in determining diagnostic, prognostic and/or treatment
information. For example, sequences that are highly expressed in the high metastatic
potential cells can be indicative of increased expression of genes or regulatory
sequences involved in the metastatic process. A patient sample displaying an increased
level of one or more of these polynucleotides may thus warrant more aggressive
treatment. In another example, sequences that display higher expression in the low
metastatic potential cells can be associated with genes or regulatory sequences that
inhibit metastasis, and thus the expression of these polynucleotides in a sample may
warrant a more positive prognosis than the gross pathology would suggest.

The differential expression of these polynucleotides can be used as a
diagnostic marker, a prognostic marker, for risk assessment, patient treatment and the
like. These polynucleotide sequences can also be used in combination with other
known molecular and/or biochemical markers.

The following tables summarize polynucleotides that are differentially
expressed between high metastatic potential lung cancer cells and low metastatic
potential lung cancer cells:

Table 9

Differentially expressed polynucleotides: Higher expression in high
metastatic potential lung cancer cells (lib8) relative to low
metastatic lung cancer cells (lib9)

SEQ ID NO: | Lib8 clones | Lib9 clones | lib8/lib9
14 | 10 | 0 | 10
117 | 5 | 0 | 5
151 | 5 | 0 | 7
152 | 9 | 0 | 13
171 | 6 | 0 | 8
700 | 10 | 0 | 14
754 | 5 | 0 | 7
[illegible] | 5 | 0 | 7
771 | 5 | 0 | 7
14R | 6 | 1 | 8
417 | 5 | 0 | 7
507 | 5 | 0 | 7
570 | 6 | 0 | 8
510 | 5 | 0 | 7
58R | 5 | 0 | 7
671 | 7 | 0 | 10
617 | 7 | 0 | 10
660 | 5 | 0 | 7
678 | 8 | 0 | 11
680 | 5 | 0 | 7
700 | 9 | 2 | 6
714 | 28 | 13 | 3
774 | 11 | 0 | 15
812 | 5 | 0 | 7
834 | 8 | 2 | 6
901 | 11 | 2 | 8
1168 | 5 | 0 | 7
1333 | 6 | 0 | 8
1352 | 5 | 0 | 7
1524 | 11 | 1 | 15
1706 | 5 | 0 | 7
1752 | 17 | 9 | 3
1768 | 20 | 4 | 7
1769 | 5 | 0 | 7
1780 | 6 | 0 | 8
1781 | 40 | 3 | 19
1799 | 6 | 1 | 8
1803 | 6 | 1 | 8
1811 | 16 | 9 | 2
1884 | 6 | 0 | 8
1919 | 8 | 1 | 11
1939 | 6 | 0 | 8
1975 | 43 | 9 | 7
2024 | 12 | 1 | 17
2045 | 8 | 1 | 11
2060 | 20 | 13 | 2
2071 | 16 | 4 | 6
2128 | 5 | 0 | 7
2177 | 10 | 2 | 7
2181 | 44 | 13 | 5
2184 | 11 | 1 | 15
2185 | 10 | 4 | 3
2283 | 7 | 0 | 10
2311 | 10 | 4 | 3
2314 | 10 | 0 | 14
2393 | 14 | 6 | 3
2398 | 6 | 1 | 8
2460 | 10 | 4 | 3
2514 | 6 | 0 | 8
2597 | 5 | 0 | 7
2657 | 8 | 2 | 6
2669 | 6 | 1 | 8
2670 | 6 | 1 | 8
3047 | 21 | 3 | 10
3050 | 16 | 5 | 4
3092 | 7 | 1 | 10
3140 | 181 | 119 | 2
3157 | 5 | 0 | 7
3187 | 16 | 5 | 4
3210 | 5 | 0 | 7
3220 | 28 | 4 | 10
3236 | 7 | 1 | 10
3249 | 16 | 0 | 22
3264 | 8 | 2 | 6
3305 | 7 | 0 | 10
3309 | 20 | 0 | 28
3318 | 24 | 4 | 8
3330 | 5 | 0 | 7
3331 | 5 | 0 | 7



Table 10

Differentially expressed polynucleotides: Higher expression in low metastatic lung
cancer cells (lib 9) relative to high metastatic potential lung cancer cells (lib 8)

SEQ ID NO: | Lib 8 clones | Lib 9 clones | lib 9/lib 8
24 | 3 | 20 | 5
53 | 0 | 18 | 13
64 | 0 | 8 | 6
70 | 0 | 11 | 8
105 | 10 | 66 | 5
129 | 0 | 16 | 11
214 | 1 | 14 | 10
233 | 4 | 35 | 6
237 | 0 | 13 | 9
264 | 0 | 29 | 21
329 | 2 | 17 | 6
368 | 1 | 37 | 26
370 | 0 | 11 | 8
418 | 0 | 8 | 6
450 | 0 | 9 | 6
461 | 0 | 9 | 6
484 | 0 | 26 | 19
494 | 0 | 41 | 29
517 | 1 | 12 | 9
522 | 1 | 11 | 8
581 | 1 | 17 | 12
614 | 3 | 23 | 5
706 | 0 | 11 | 8
726 | 5 | 23 | 3
806 | 0 | 14 | 10
824 | 0 | 9 | 6
836 | 1 | 14 | 10
874 | 0 | 12 | 9
900 | 5 | 21 | 3
1017 | 2 | 14 | 5
1144 | 0 | 8 | 6
1154 | 0 | 12 | 9
1166 | 2 | 45 | 16
1170 | 1 | 13 | 9
1302 | 2 | 13 | 5
1326 | 1 | 13 | 9
1327 | 1 | 13 | 9
1367 | 0 | 12 | 9
1377 | 0 | 12 | 9
1437 | 2 | 18 | 6
1442 | 1 | 14 | 10
1466 | 0 | 13 | 9
1476 | 0 | 13 | 9
1495 | 0 | 8 | 6
1496 | 1 | 13 | 9
1664 | 38 | 253 | 5
1682 | 1 | 17 | 12
1687 | 0 | 9 | 6
1758 | 0 | 8 | 6
1817 | 4 | 18 | 3
1837 | 3 | 16 | 4
1845 | 3 | 23 | 5
1856 | 2 | 17 | 6
1910 | 1 | 18 | 13
2146 | 2 | 16 | 9
2156 | 0 | 9 | 6
2463 | 0 | 12 | 9
2724 | 10 | 38 | 3
2749 | 403 | 2000 | 4
2801 | 6 | 25 | 3
2993 | 3 | 18 | 4
3080 | 0 | 10 | 7
3107 | 3 | 23 | 5
3292 | 0 | 20 | 14
3324 | 110 | 548 | 4

EXAMPLE 7

Polynucleotides Differentially Expressed in High Metastatic Potential
Colon Cancer Cells Versus Low Metastatic Colon Cancer Cells

A number of polynucleotide sequences have been identified that are
differentially expressed between cells derived from high metastatic potential colon
cancer cells and low metastatic colon cancer cells. Expression of these sequences in
colon cancer tissue can provide diagnostic, prognostic and/or treatment information.
For example, sequences that are highly expressed in the high metastatic potential cells
can be indicative of increased expression of genes or regulatory sequences involved in
the metastatic process. A patient sample displaying an increased level of one or more of
these polynucleotides may thus warrant more aggressive treatment. In another example,
sequences that display higher expression in the low metastatic potential cells can be
associated with genes or regulatory sequences that inhibit metastasis, and thus the
expression of these polynucleotides in a sample may warrant a more positive prognosis
than the gross pathology would suggest.

The differential expression of these polynucleotides can be used as a
diagnostic marker, a prognostic marker, for risk assessment, patient treatment and the
like. These polynucleotide sequences can also be used in combination with other
known molecular and/or biochemical markers.

The following table summarizes identified polynucleotides with
differential expression between high metastatic potential colon cancer cells and low
metastatic potential colon cancer cells:

Table 11

Differentially expressed polynucleotides: Higher expression in low metastatic colon
cancer cells (lib 2) relative to high metastatic potential colon cancer cells (lib 1)

SEQ ID NOs: | Lib 1 clones | Lib 2 clones | lib 2/lib 1
429 | 0 | 9 | 10
1494 | 0 | 8 | 9
1923 | 34 | 114 | 4
1986 | 3 | 12 | 4
2018 | 0 | 9 | 10
2036 | 2 | 10 | 5
2049 | 8 | 25 | 3
2135 | 24 | 87 | 4
2146 | 2 | 16 | 9
2208 | 6 | 27 | 5
2215 | 2 | 11 | 6
2239 | 1 | 10 | 11
2307 | 2 | 12 | 6
2313 | 28 | 62 | 2
2357 | 5 | 14 | 3
2360 | 3 | 21 | 8
2362 | 0 | 6 | 6
2378 | 3 | 12 | 4
2569 | 3 | 20 | 7
2571 | 0 | 6 | 6
2588 | 54 | 172 | 3
2592 | 15 | 41 | 3
2611 | 0 | 6 | 6
2636 | 0 | 9 | 10
2641 | 7 | 20 | 3
2650 | 0 | 9 | 10
2662 | 0 | 9 | 10
2674 | 4 | 13 | 4
2682 | 0 | 6 | 6
2702 | 9 | 25 | 3
2704 | 8 | 23 | 3
2715 | 2 | 12 | 6
2804 | 9 | 22 | 3
2821 | 13 | 29 | 2
2840 | 1 | 8 | 9
2846 | 2 | 15 | 8
2866 | 0 | 6 | 6
2906 | 0 | 6 | 6
2915 | 44 | 109 | 3
2933 | 0 | 6 | 6
2935 | 5 | 16 | 3
2957 | 1 | 11 | 12
2959 | 3 | 27 | 10
2977 | 16 | 30 | 2
2980 | 12 | 27 | 2
3000 | 2 | 13 | 7
3009 | 12 | 29 | 3
3115 | 0 | 7 | 8
3156 | 502 | 2170 | 5
3210 | 2 | 21 | 11
3211 | 0 | 9 | 10
3213 | 0 | 7 | 8
3235 | 2 | 12 | 6
3251 | 2 | 12 | 6
3296 | 3 | 12 | 4
3335 | 1 | 8 | 9



EXAMPLE 8

Polynucleotides Differentially Expressed in High Metastatic Potential
Colon Cancer Patient Tissue Versus Normal Patient Tissue

A number of polynucleotide sequences have been identified that are
differentially expressed between cells derived from high metastatic potential colon
cancer tissue and normal tissue. Expression of these sequences in colon cancer tissue
can provide diagnostic, prognostic and/or treatment information. For example,
sequences that are highly expressed in the high metastatic potential cells can be
indicative of increased expression of genes or regulatory sequences involved in the
advanced disease state which involves processes such as angiogenesis, dedifferentiation,
cell replication, and metastasis. A patient sample displaying an increased level of one
or more of these polynucleotides may thus warrant more aggressive treatment.

The differential expression of these polynucleotides can be used as a
diagnostic marker, a prognostic marker, for risk assessment, patient treatment and the
like. These polynucleotide sequences can also be used in combination with other
known molecular and/or biochemical markers.

The following tables summarize polynucleotides that are differentially
expressed between high metastatic potential colon cancer tissue and normal colon
tissue:
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Table 12 

Differentially expressed polynucleotides isolated from samples from two patients 
(patient 2 and patient 3 and) : Lower expression in high metastatic potential colon tissue 
(patient 2:lib 17; patient 3:lib 20) vs. normal colon tissue (patient 2:lib 15; patient 
5 3:libl8) 



SEQ ID NO:   lib 15 clones   lib 17 clones   lib 15/lib 17
69           19              7               3
123          6               0               6
140          24              8               3
197          6               0               6
198          113             0               121
254          28              9               3
412          28              9               3
512          11              1               12
641          17              7               3
642          7               0               8
954          12              3               4
1011         209             16              14
1024         8               0               9
1040         12              3               4
1055         26              7               4
1106         31              15              2
1125         17              0               15
1129         17              [illegible]     [illegible]
1138         109             0               117
[one row illegible in source]
1253         73              0               78
1283         34              7               5
1285         34              7               5
1339         13              4               3
1474         73              0               78
1505         18              3               6
1553         68              6               12
1554         2542            14              195
1605         2542            14              195
1628         6               0               6
1643         142             4               38
1753         12              0               10
1764         13              0               14

SEQ ID NO:   lib 18 clones   lib 20 clones   lib 18/lib 20
105          28              11              2
198          21              0               18
254          9               0               8
412          9               0               8
1011         11              1               9
1138         14              0               12
1253         23              0               20
1643         18              0               15
1764         12              0               10
3156         140             43              3



Table 13 

Differentially expressed polynucleotides isolated from samples from two patients
(patient 2 and patient 3): Lower expression in normal colon tissue (patient 2:lib 15;
patient 3:lib 18) vs. high metastatic potential colon tissue (patient 2:lib 17;
patient 3:lib 20).



SEQ ID NO:   lib 15 clones   lib 17 clones   lib 17/lib 15
321          3               23              7
363          1               9               8
836          21              99              4
859          6               20              3
885          13              28              2
916          13              28              2
981          2               11              5
1226         8               70              8
1308         0               8               7
1317         29              84              3
1429         27              127             4
1442         0               9               8
1534         1               12              11
1540         12              43              3
1552         0               7               7
1556         1               9               8
1557         1               9               8
1569         2189            5122            2
1571         6               18              3
1576         3               25              8
1581         4               22              5
1601         25              157             6
1613         9               48              5
1616         15              61              4
1620         2               17              8
1622         4               99              23
1626         6               35              5
1647         4               22              5
1664         4               28              7
1683         2               18              8
1704         3               15              5
1800         0               7               7
2749         23              60              2
2784         4               14              3
2805         1               9               8
2976         3               14              4
3128         18              57              3
3129         26              124             4
3146         64              210             3
3150         940             2267            2
3151         2               15              7

SEQ ID NO:   lib 18 clones   lib 20 clones   lib 20/lib 18
865          0               5               6
1569         1               7               8
1580         1               7               8
1590         1               7               8
2790         0               5               6



EXAMPLE 9 

Polynucleotides Differentially Expressed in High Colon Tumor Potential 
Patient Tissue Versus Metastasized Colon Cancer Patient Tissue 
A number of polynucleotide sequences have been identified that are
differentially expressed between cells derived from colon cancer tissue and cells derived
from colon cancer tissue metastases to liver. Expression of these sequences in colon
cancer tissue can provide diagnostic, prognostic and/or treatment information associated
with the transformation of precancerous tissue to malignant tissue. This information
can be useful in preventing progression to the advanced malignant state in these
tissues, and can be important in risk assessment for a patient.

The following table summarizes identified polynucleotides with
differential expression between high tumor potential colon cancer tissue and cells
derived from high metastatic potential colon cancer cells:

Table 14

Differentially expressed polynucleotides:
Greater expression in metastatic colon tumor tissue (lib 20) vs.
colon tumor tissue (lib 19)



SEQ ID NO:   lib 19 clones   lib 20 clones   lib 20/lib 19
937          0               6               8
976          0               5               7
1520         1               8               11
1546         1               11              15
1550         1               11              15
1574         1               8               11
1580         0               7               9
1590         0               7               9
1599         8               21              4
1607         158             632             5
1622         1               7               9


Table 15 

Greater expression in colon tumor tissue (lib 19) than metastatic colon tissue (lib 20) 



SEQ ID NO:   lib 19 clones   lib 20 clones   lib 19/lib 20
105          64              11              4
1011         53              1               40
1226         18              4               3
1571         8               0               6
1726         15              3               4
1811         17              2               6
2749         47              6               6
3146         19              2               7
3324         20              1               15

EXAMPLE 10

Polynucleotides Differentially Expressed in High Tumor Potential 
Colon Cancer Patient Tissue Versus Normal Patient Tissue 
A number of polynucleotide sequences have been identified that are
differentially expressed between cells derived from high tumor potential colon cancer
tissue and normal tissue. Expression of these sequences in colon cancer tissue can
provide diagnostic, prognostic and/or treatment information associated with the
prevention of the malignant state in these tissues, and can be important in risk
assessment for a patient. For example, sequences that are highly expressed in the
potential colon cancer cells are associated with or can be indicative of increased
expression of genes or regulatory sequences involved in early tumor progression. A
patient sample displaying an increased level of one or more of these polynucleotides
may thus warrant closer attention or more frequent screening procedures to catch the
malignant state as early as possible.

The following tables summarize polynucleotides that are differentially
expressed between high tumor potential colon cancer cells and normal colon cells:

Table 16

Differentially expressed polynucleotides detected in samples from patient (patient 2):
Higher expression in normal colon tissue (patient 2, lib 15)
vs. tumor potential colon tissue (patient 2, lib 16)



SEQ ID NO:   lib 15 clones   lib 16 clones   lib 16/lib 15
69           19              7               3
105          116             54              2
140          24              4               6
197          6               0               6
198          113             3               40
254          28              6               5
412          28              6               5
642          7               0               7
830          10              2               5
938          31              13              3
1011         209             37              6
1095         12              3               4
1125         17              0               18
1129         17              [illegible]     [illegible]
1138         109             [illegible]     [illegible]
1253         73              [illegible]     77
1283         34              13              3
1285         34              13              3
1339         13              3               5
1453         11              3               4
1474         73              1               77
1505         18              6               3
1554         2542            448             6
1605         2542            448             6
1614         36              14              3
1630         24              9               3
1643         142             2               75
[one row illegible in source]
1649         24              8               3
1677         19              6               3
1753         13              0               14
1764         13              0               14
1766         177             65              3
1772         24              8               3



Table 17 

Differentially expressed polypeptides detected in samples from patient. Lower 
expression in normal colon tissue (lib 18) than colon tumor tissue (lib 19) 



SEQ ID NO:   lib 18 clones   lib 19 clones   lib 19/lib 18
3146         3               19              6
3150         21              228             10
3324         3               20              6

Table 18

Differentially expressed polypeptides detected in samples from patient. Higher
expression in normal colon tissue (lib 18) than colon tumor tissue (lib 19)



SEQ ID NO:   lib 18 clones   lib 19 clones   lib 18/lib 19
198          21              2               12
465          6               0               7
489          6               0               7
745          6               0               7
859          11              2               6
976          7               0               8
1011         209             37              6
1045         8               1               9
1138         14              0               16
1253         23              0               26
1392         16              4               5
1474         23              0               26
1589         6               0               7
1591         22              11              2
1607         386             158             3
1643         18              0               21
1753         12              0               14
1764         12              0               14

SEQ ID NO:   lib 18 clones   lib 19 clones   lib 19/lib 18
105          28              64              2
1011         11              53              4
1226         2               18              8
1251         6               19              3
1559         1               9               8
1571         0               8               7
1608         1               9               8
1766         2               13              6
1782         1               9               8
1811         1               17              15

Table 19

Differentially expressed polynucleotides: 
Higher expression in colon tumor tissue 
(patient 2, lib 16) vs. normal colon tissue (patient 2, lib 15) 



SEQ ID NO:   lib 15 clones   lib 16 clones   lib 16/lib 15
7            1               9               9
164          6               19              3
734          4               15              4
836          21              53              2
928          2               11              5
965          2               11              5
987          2               11              5
1026         7               19              3
1044         4               16              4
1119         4               16              4
1226         8               46              5
1227         0               9               9
1251         7               95              13
1316         0               6               6
1429         27              81              3
1442         0               9               9
1540         12              28              2
1553         68              590             8
1560         4               24              6
1577         1               10              9
1588         5               20              4
1610         3               13              4
1620         2               23              11
1626         6               23              4
1673         2               15              7
2416         0               7               7
2749         23              54              2
2976         3               14              4
3129         26              64              2
3132         18              54              3

EXAMPLE 11

Polynucleotides Differentially Expressed in Growth Factor-Stimulated
Human Microvascular Endothelial Cells (HMEC) Relative to Untreated HMEC

A number of polynucleotide sequences have been identified that are
differentially expressed between human microvascular endothelial cells (HMEC) that
have been treated with growth factors and untreated HMEC.

Sequences that are differentially expressed between growth factor-treated
HMEC and untreated HMEC can represent sequences encoding gene products involved
in angiogenesis, metastasis (cell migration), and other developmental and oncogenic
processes. For example, sequences that are more highly expressed in HMEC treated
with growth factors (such as bFGF or VEGF) relative to untreated HMEC can serve as
markers of cancer cells of higher metastatic potential. Detection of expression of these
sequences in colon cancer tissue can provide diagnostic, prognostic and/or treatment
information associated with preventing progression to the malignant state in these
tissues, and can be important in risk assessment for a patient. A patient sample
displaying an increased level of one or more of these polynucleotides may thus warrant
closer attention or more frequent screening procedures to catch the malignant state as
early as possible.

The following table summarizes identified polynucleotides with
differential expression between growth factor-treated and untreated HMEC.

Table 20

Differentially expressed polynucleotides:
Higher expression in untreated HMEC (lib 12) vs. bFGF treated HMEC (lib 13)



SEQ ID NO:   lib 12 clones   lib 13 clones   lib 12/lib 13
849          6               0               6
1059         6               0               6
1206         12              2               6
3208         12              0               12

Lower expression in untreated HMEC (lib 12) vs. bFGF treated HMEC (lib 13)

2748         3               12              4
3325         0               6               6



Table 21 

Differentially expressed polynucleotides: 
Higher expression in untreated HMEC (lib 12) vs. VEGF treated HMEC (lib 14)

SEQ ID NO:   lib 12 clones   lib 14 clones   lib 12/lib 14
1150         9               0               9

Lower expression in untreated HMEC (lib 12) vs. VEGF treated HMEC (lib 14)

3324         22              50              [illegible]



EXAMPLE 12 

Polynucleotides Differentially Expressed in Normal Prostate Cells 
Relative to Prostate Cancer Cells 
A number of polynucleotide sequences have been identified that are
differentially expressed between normal prostate cells and prostate
cancer cells. Expression of these sequences in prostate tissue suspected of being
cancerous can provide diagnostic, prognostic and/or treatment information. These
polynucleotide sequences can also be used in combination with other known molecular
and/or biochemical markers. The following table summarizes identified
polynucleotides with differential expression between normal prostate cells and
prostate cancer cells:

Table 22

Differentially expressed polynucleotides: normal prostate cell line (lib 21)
vs. prostate cancer cell line (lib 22)
Higher in lib 21



SEQ ID NO:   lib 21 clones   lib 22 clones   lib 21/lib 22
53           17              2               8
1754         22              8               3
1801         7               0               7
1845         22              6               4
446          8               0               8
1410         6               0               6
2060         18              6               3
2143         12              3               4
2632         13              1               13
2899         16              2               8
3338         12              2               6

Higher in lib 22

86           2               13              7
93           0               9               9
687          0               9               9
1269         1               15              15
1581         25              74              3
1647         25              74              3
1649         12              27              2
1710         5               16              3
1717         5               16              3
1772         12              27              2
1960         0               6               6
2987         0               6               6
3128         13              42              3
3132         13              42              3
3150         263             962             4
3222         0               6               6
3268         0               6               6

EXAMPLE 13

Polynucleotides Differentially Expressed Across Multiple Libraries 

A number of polynucleotide sequences have been identified that are
differentially expressed between cancerous cells and normal cells across two or more
tissue types tested (i.e., breast, colon, lung, and prostate). Expression of these
sequences in a tissue of any origin can provide diagnostic, prognostic and/or treatment
information associated with the prevention of achieving the malignant state in these
tissues, and can be important in risk assessment for a patient. These polynucleotides
can also serve as non-tissue specific markers of, for example, risk of metastasis of a
tumor. The following polynucleotides were differentially expressed but without tissue
type-specificity in at least two of the breast, colon, lung, and prostate libraries tested:
53, 105, 355, 412, 614, 836, 1442, 1581, 1647, 1649, 1664, 1772, 1782, 1811, 1845,
1856, 1875, 1923, 2060, 2071, 2135, 2146, 2239, 2313, 2378, 2393, 2416, 2460, 2490,
2632, 2674, 2704, 2724, 2749, 2784, 2804, 2959, 2976, 2977, 2980, 2987, 3009, 3047,
3128, 3129, 3132, 3146, 3150, 3156, 3210, 3324, 3331, and 3335.
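
The selection described above, sequences that are differentially expressed in at least two of the tissue types tested, amounts to counting in how many tissue-specific result sets each SEQ ID NO appears. A minimal Python sketch follows. The per-tissue sets are small hypothetical examples with made-up identifiers, not the data behind the list above, and `recurrent_ids` is an illustrative name.

```python
from collections import Counter


def recurrent_ids(hits_by_tissue, min_tissues=2):
    """Return sorted identifiers found in at least `min_tissues` tissue sets.

    hits_by_tissue maps a tissue name to the collection of SEQ ID NOs
    that were differentially expressed (in either direction) in that tissue.
    """
    counts = Counter()
    for ids in hits_by_tissue.values():
        counts.update(set(ids))  # set() so a repeat within one tissue counts once
    return sorted(seq_id for seq_id, n in counts.items() if n >= min_tissues)


# Hypothetical toy input for illustration only.
example = {
    "colon": {11, 12, 13, 14},
    "prostate": {15, 16, 14},
    "breast": {11, 15},
    "lung": {17},
}
print(recurrent_ids(example))
```

Raising `min_tissues` narrows the result to markers shared by more tissue types.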

Those skilled in the art will recognize, or be able to ascertain, using not
more than routine experimentation, many equivalents to the specific embodiments of
the invention described herein. Such specific embodiments and equivalents are
intended to be encompassed by the following claims.

All publications and patent applications cited in this specification are
herein incorporated by reference as if each individual publication or patent application
were specifically and individually indicated to be incorporated by reference. The
citation of any publication is for its disclosure prior to the filing date and should not be
construed as an admission that the present invention is not entitled to antedate such
publication by virtue of prior invention.

Although the foregoing invention has been described in some detail by
way of illustration and example for purposes of clarity of understanding, it is readily
apparent to those of ordinary skill in the art in light of the teachings of this invention
that certain changes and modifications may be made thereto without departing from the
spirit or scope of the appended claims.

Deposit Information: 

The following materials were deposited with the American Type Culture 
Collection (ATCC); CMCC = Chiron Master Culture Collection: 



WO 01/02568 PCT/US00/18374

cDNA Libraries Deposited with ATCC

Tube Number   Deposit Date    ATCC Accession No.   CMCC Accession No.
ES 137        May 30, 2000
ES 138        May 30, 2000
ES 139        May 30, 2000
ES 140        May 30, 2000
ES 141        May 30, 2000
ES 142        May 30, 2000
ES 143        May 30, 2000
ES 144        May 30, 2000
ES 145        May 30, 2000
ES 146        May 30, 2000
ES 147        May 30, 2000
ES 148        May 30, 2000
ES 149        May 30, 2000
ES 150        May 30, 2000
ES 151        May 30, 2000
ES 152        May 30, 2000
ES 153        May 30, 2000
ES 154        May 30, 2000
ES 155        May 30, 2000
ES 156        May 30, 2000
ES 157        May 30, 2000
ES 158        May 30, 2000
ES 159        May 30, 2000
ES 160        May 30, 2000
ES 161        May 30, 2000
ES 162        May 30, 2000
ES 163        May 30, 2000
ES 164        May 30, 2000
ES 165        May 30, 2000
ES 166        May 30, 2000
ES 167        May 30, 2000

Table 23 lists the clones for each deposit, designated as "tube" number.
This deposit is provided merely as convenience to those of skill in the art, and is not an
admission that a deposit is required under 35 U.S.C. §112. The sequence of the
polynucleotides contained within the deposited material, as well as the amino acid
sequence of the polypeptides encoded thereby, are incorporated herein by reference and
are controlling in the event of any conflict with the written description of sequences
herein. A license may be required to make, use, or sell the deposited material, and no
such license is granted hereby.

Retrieval of Individual Clones from Deposit of Pooled Clones 

Where the ATCC deposit is composed of a pool of cDNA clones, the
deposit was prepared by first transfecting each of the clones into separate bacterial cells.
The clones were then deposited as a pool of equal mixtures in the composite deposit.
Particular clones can be obtained from the composite deposit using methods well
known in the art. For example, a bacterial cell containing a particular clone can be
identified by isolating single colonies, and identifying colonies containing the specific
clone through standard colony hybridization techniques, using an oligonucleotide probe
or probes designed to specifically hybridize to a sequence of the clone insert (e.g., a
probe based upon unmasked sequence of the encoded polynucleotide having the
indicated SEQ ID NO). The probe should be designed to have a Tm of approximately
80°C (assuming 2°C for each A or T and 4°C for each G or C). Positive colonies can
then be picked, grown in culture, and the recombinant clone isolated. Alternatively,
probes designed in this manner can be used in PCR to isolate a nucleic acid molecule
from the pooled clones according to methods well known in the art, e.g., by purifying
the cDNA from the deposited culture pool, and using the probes in PCR reactions to
produce an amplified product having the corresponding desired polynucleotide
sequence.
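
The melting-temperature rule of thumb quoted above (2°C for each A or T and 4°C for each G or C) can be sketched in Python as follows. The function names, the 80°C default target, and the strategy of extending a probe one base at a time from a chosen start position are illustrative assumptions. The example sequence is made up and is not taken from any SEQ ID NO.

```python
def estimate_tm(probe):
    """Estimate Tm in degrees C as 2 per A or T plus 4 per G or C."""
    probe = probe.upper()
    at = sum(probe.count(base) for base in "AT")
    gc = sum(probe.count(base) for base in "GC")
    if at + gc != len(probe):
        raise ValueError("probe must contain only A, C, G and T")
    return 2 * at + 4 * gc


def pick_probe(sequence, target_tm=80, start=0):
    """Extend a probe from `start` until its estimated Tm reaches target_tm.

    Returns the probe string, or None if the sequence ends first.
    """
    for end in range(start + 1, len(sequence) + 1):
        candidate = sequence[start:end]
        if estimate_tm(candidate) >= target_tm:
            return candidate
    return None


# Made-up insert sequence for illustration only.
insert = "ATGGCCTTAGCGGATCCGTACGATTGCAGGCTTAACGG"
probe = pick_probe(insert)
print(probe, estimate_tm(probe) if probe else None)
```

This simple additive estimate is only a rough guide; a real probe would also be checked for specificity against the other clones in the pool.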

Table 23 





M00001351A:B02    ES 137
M00001356A:H11    ES 137
M00001363D:D09    ES 137
M00001395D:H02    ES 137
M00001439C:H06    ES 137
M00001476B:G10    ES 137
M00001582A:E02    ES 137
M00003750D:E06    ES 137
M00003761C:F02    ES 137
M00003770A:E05    ES 137
M00003786A:A11    ES 137
M00003800A:F09    ES 137
M00003816D:E11    ES 137
M00003902A:C03    ES 137
M00003991C:F06    ES 137
M00003995B:E03    ES 137
M00004046C:A08    ES 137
M00004105D:D05    ES 137
M00004139B:B10    ES 137
M00004140D:C03    ES 137
M00004144A:H05    ES 137
M00004152A:C12    ES 137
M00004155D:A10    ES 137
M00004168A:G11    ES 137
M00004197B:H10    ES 137
M00004222C:E03    ES 137
M00004234A:E07    ES 137
M00004239[illegible]    ES 137
M00004241B:H07    ES 137
M00004264B:A05    ES 137
[illegible entries omitted]
M00027085C:E11    ES 137
M00027094A:B03    ES 137
M00027103B:A09    ES 137
M00027108C:B03    ES 137
M00027121D:C05    ES 137
M00027135A:B11    ES 137
M00027136C:C09    ES 137
M00027141C:H03    ES 137
M00027159D:F03    ES 137
M00027162B:F05    ES 137
M00027178B:G09    ES 137
M00027179D:E06    ES 138
M00027181D:A05    ES 138
M00027195C:E04    ES 138
M00027198B:B08    ES 138
[illegible entries omitted]
M00043015D:D05    ES 167
M00043016B:F09    ES 167
M00043017C:D08    ES 167
M00043063C:H05    ES 167
M00043070A:C03    ES 167
M00043113C:G09    ES 167
M00042617B:E01    ES 167
M00043074C:D07    ES 167
M00043076D:A02    ES 167
M00043077B:F11    ES 167
M00043077C:D12    ES 167
M00043077D:D10    ES 167
M00043099A:H04    ES 167
M00043101D:G11    ES 167
M00043134A:F05    ES 167
M00043152C:B10    ES 167
M00043213A:D05    ES 167
M00043219C:C02    ES 167
M00043221D:C12    ES 167
M00043222C:B06    ES 167
M00043455B:C08    ES 167
M00043465C:H11    ES 167
M00043470A:C10    ES 167
M00043485C:C03    ES 167
M00043490C:F02    ES 167
M00043495C:H05    ES 167
M00043528A:E11    ES 167
M00043[illegible]9A:B08    ES 167
M00043640A:B01    ES 167

CLAIMS

We claim: 

1. A library of polynucleotides, the library comprising the sequence
information of at least one of SEQ ID NOs: 1-3351.

2. The library of claim 1, wherein the library is provided on a nucleic
acid array.

3. The library of claim 1, wherein the library is provided in a
computer-readable format.

4. The library of claim 1, wherein the library comprises a
polynucleotide corresponding to a gene differentially expressed in a cancer cell of high
metastatic potential relative to a control cell, wherein the control cell is a normal cell or a
cell of low metastatic potential, wherein the expression is greater in the metastatic tissue,
and wherein the sequence is selected from the group consisting of SEQ ID NOs: 14, 137,
151, 152, 171, 200, 254, 262, 271, 348, 412, 472, 507, 520, 530, 588, 623, 637, 660, 678,
680, 700, 714, 774, 812, 834, 901, 937, 976, 1168, 1333, 1352, 1520, 1524, 1546, 1550,
1574, 1580, 1590, 1599, 1607, 1622, 1706, 1752, 1768, 1769, 1780, 1781, 1799, 1803,
1811, 1851, 1856, 1867, 1872, 1875, 1884, 1919, 1923, 1939, 1975, 2024, 2045, 2060,
2071, 2118, 2119, 2128, 2135, 2177, 2181, 2184, 2185, 2190, 2193, 2232, 2239, 2283,
2311, 2314, 2338, 2378, 2393, 2394, 2395, 2398, 2460, 2490, 2505, 2514, 2540, 2542,
2597, 2607, 2640, 2657, 2669, 2670, 2674, 2679, 2684, 2707, 2724, 2757, 2776, 2804,
2818, 2906, 2959, 2964, 2968, 2976, 2980, 2987, 3010, 3043, 3047, 3050, 3071, 3072,
3092, 3095, 3097, 3140, 3157, 3173, 3187, 3203, 3210, 3212, 3220, 3236, 3249, 3264,
3284, 3288, 3305, 3309, 3318, 3330, 3331, and 3335.

5. The library of claim 1, wherein the library comprises a
polynucleotide corresponding to a gene differentially expressed in normal colon tissue
relative to colon cancer tissue, wherein the expression is greater in the cancer tissue, and
wherein the sequence is selected from the group consisting of SEQ ID NOs: 7, 164, 734,
836, 928, 965, 987, 1026, 1044, 1119, 1226, 1227, 1251, 1316, 1429, 1442, 1540, 1553,
1560, 1577, 1588, 1610, 1620, 1626, 1673, 2416, 2749, 2976, 3129 and 3132.

6. The library of claim 1, wherein the library comprises a
polynucleotide corresponding to a gene differentially expressed in normal colon tissue
relative to colon cancer tissue, wherein the expression is greater in normal tissue than
cancer tissue, and wherein the sequence is selected from the group consisting of SEQ ID
NOs: 105, 198, 465, 489, 745, 859, 976, 1011, 1045, 1138, 1226, 1251, 1253, 1392, 1474,
1559, 1571, 1589, 1591, 1607, 1608, 1643, 1753, 1764, 1766, 1782, 1811, 2749, 2784,
2790, 2805, 2976, 3128, 3129, 3146, 3150, and 3151.

7. The library of claim 1, wherein the library comprises a
polynucleotide corresponding to a gene differentially expressed in normal human
prostate cells relative to human prostate cancer cells, wherein the expression is greater
in normal cells than cancer cells, and wherein the sequence is selected from the group
consisting of SEQ ID NOs: 53, 446, 1410, 1754, 1801, 1845, 2060, 2143, 2632, 2899,
and 3338.

8. The library of claim 1, wherein the library comprises a
polynucleotide corresponding to a gene differentially expressed in normal human
prostate cells relative to human prostate cancer cells, wherein the expression is greater
in cancer cells than normal cells, and wherein the sequence is selected from the group
consisting of SEQ ID NOs: 86, 93, 687, 1269, 1581, 1647, 1649, 1710, 1717, 1772,
1960, 2987, 3128, 3132, 3150, 3222, and 3268.

9. An isolated polynucleotide comprising a nucleotide sequence
having at least 90% sequence identity to an identifying sequence of SEQ ID NOs: 1-3351 or
a degenerate variant or fragment thereof.

10. A recombinant host cell containing the polynucleotide of claim 9.

11. An isolated polypeptide encoded by the polynucleotide of claim 9.

12. An antibody that specifically binds a polypeptide of claim 11.

13. A vector comprising the polynucleotide of claim 9.

14. A method of detecting differentially expressed genes correlated
with a cancerous state of a mammalian cell, the method comprising the step of:

detecting at least one differentially expressed gene product in a test sample
derived from a cell suspected of being cancerous, wherein the gene product is encoded by a
gene corresponding to a sequence of at least one of SEQ ID NOs: 14, 137, 151, 152, 171,
200, 254, 262, 271, 348, 412, 472, 507, 520, 530, 588, 623, 637, 660, 678, 680, 700, 714,
774, 812, 834, 901, 937, 976, 1168, 1333, 1352, 1520, 1524, 1546, 1550, 1574, 1580,
1590, 1599, 1607, 1622, 1706, 1752, 1768, 1769, 1780, 1781, 1799, 1803, 1811, 1851,
1856, 1867, 1872, 1875, 1884, 1919, 1923, 1939, 1975, 2024, 2045, 2060, 2071, 2118,
2119, 2128, 2135, 2177, 2181, 2184, 2185, 2190, 2193, 2232, 2239, 2283, 2311, 2314,
2338, 2378, 2393, 2394, 2395, 2398, 2460, 2490, 2505, 2514, 2540, 2542, 2597, 2607,
2640, 2657, 2669, 2670, 2674, 2679, 2684, 2707, 2724, 2757, 2776, 2804, 2818, 2906,
2959, 2964, 2968, 2976, 2980, 2987, 3010, 3043, 3047, 3050, 3071, 3072, 3092, 3095,
3097, 3140, 3157, 3173, 3187, 3203, 3210, 3212, 3220, 3236, 3249, 3264, 3284, 3288,
3305, 3309, 3318, 3330, 3331, and 3335,

wherein detection of the differentially expressed gene product is correlated with
a cancerous state of the cell from which the test sample was derived.

15. A method of detecting differentially expressed genes correlated
with a cancerous state of a mammalian cell, the method comprising the step of:

detecting at least one differentially expressed gene product in a test
sample derived from a cell suspected of being cancerous, wherein the gene product is
encoded by a gene corresponding to a sequence of at least one of SEQ ID NOs: 7, 164,
734, 836, 928, 965, 987, 1026, 1044, 1119, 1226, 1227, 1251, 1316, 1429, 1442, 1540,
1553, 1560, 1577, 1588, 1610, 1620, 1626, 1673, 1960, 2416, 2749, 2976, 2987, 3128,
3129, 3132, 3150, 3222, and 3268,

wherein detection of the differentially expressed gene product is correlated with
a cancerous state of the cell from which the test sample was derived.



